ABSTRACT


Alternative time- and frequency-domain equations are presented for predicting the
loudness of a wide variety of statistically stationary and nonstationary sounds, either 
continuous or discontinuous. Zwislocki's theory of temporal summation and S. S. 
Stevens' psychoacoustic conversion law are incorporated in the present mathematical 
theory. Frequency domain formulas of Zepler and Harel for impulsive sonic booms and 
Jones for steady noise represent specializations of the present formulas. For sinusoidal
inputs, modified Fletcher-Munson auditory response curves are predicted. For an
impulsive input the measured response is also predicted. 



PHENOMENOLOGICAL THEORY OF LOUDNESS 
by Walton L. Howes
Lewis Research Center 

SUMMARY 

A unified theory is derived which should permit the loudness of most sounds, continuous and 
discontinuous, to be predicted from known time or frequency characteristics of the sound. It is 
assumed that the input sound intensity averaged over a finite time is uniquely related to loudness. 
This relation is modified to include operational processes by which the human auditory system 
converts intensity into loudness. The processes by which the input pressure signal is transmitted 
to the brain are assumed to be linear. However, the conversion from a physical (neurological) 
signal into psychological response (loudness), which occurs in the brain, is nonlinear. 

Physically, the input sound pressure wave is linearly converted in successive steps into an 
electrical wave, which reproduces the original waveform, by hair cells within the organ of Corti. 
Next, the auditory nerve endings respond to the time rate of change of this current rather than to 
the current itself. The resulting information is transmitted to the brain along the auditory nerve. 
This information is evaluated in the brain and subjectively interpreted as loudness. 

Mathematically, the time domain representation of the current output of the hair cells is a 
Fourier convolution of the impulse response of the entire preceding system with the original input 
sound. The response of the auditory nerve endings corresponds to time differentiation of the 
Fourier convolution. The signal transmitted to the brain contains information regarding "electric
power," which is assumed to be uniquely related to loudness. The loudness is a function of
a finite-time integral of the power. The conversion from physical output to psychoacoustic response
is accomplished by using S. S. Stevens' psychoacoustic conversion law. The frequency
domain representations of these processes are derived by using Fourier series and transforms.
The fact that the loudness is a function taken over finite times implies that the frequency representations
can be written in terms of "running" Fourier transforms. The complete history of the
loudness is predicted.

The complete auditory system must act as a nonideal band-pass filter. The response of a
selected filter characterizes the system. Part of the selected response function may be attrib- 
uted to the time differentiation process. The rest of the function is a generalization of that ob- 
tained by Zwislocki in his theory of temporal auditory summation. Thus, Zwislocki’s theory is 
implicit in the present one. 

The present theory was tested using two fundamental inputs, sine waves and impulses. For 
sine wave inputs the theory predicts the Fletcher-Munson frequency response curves minus a diffraction
correction for the disturbance created by the human head. For impulsive inputs the
theory predicts loudness proportional to the intensity, as measured. 

Frequency domain formulas of Zepler and Harel for impulsive sonic booms and Jones for 
steady noise represent specializations of the present formulas. 



INTRODUCTION 


Well-known empirical methods exist for predicting the loudness of certain statisti- 
cally stationary sounds (refs. 1 to 3), that is, of certain sounds whose statistics are in- 
dependent of time. Methods for predicting the loudness of certain statistically nonsta- 
tionary sounds may be less well known (refs. 4 to 6). No means exists for predicting the 
loudness of all sounds of either statistical class. Moreover, no single scheme has been 
shown to predict correctly the loudness of some sounds in both classes. The main de- 
terrent in developing a unified theory of loudness appears to be an impression that the 
complete auditory system is so complex that a concise mathematical representation of 
the entire system is not feasible (ref. 6), and that, because the psychoacoustic response 
to an acoustic input is highly nonlinear, Fourier analysis is not readily applicable to the 
entire system (refs. 6 and 7). Thus, there would seem to be an inherent difficulty in re- 
lating time and frequency representations of psychoacoustic response to a given acoustic 
input. The main purpose of this report is to show that a practical unified theory of loud- 
ness based on Fourier methods is possible and that the theory proposed herein leads to 
predictions in good agreement with experiment. 

The theory to be described resulted from a desire to obtain alternative time and 
frequency descriptions of the loudness of sonic booms produced by supersonic aircraft. 
Such a theory might be useful in determining the extent to which undesirable human re- 
sponse to sonic booms could be minimized by controlling the boom pressure signature 
(ref. 8). The present theory appears to have much broader validity than originally in- 
tended. 

There are at least three forms of theory which might be developed, namely, one 
based on the physics and psychophysiology of the ear, nervous system and brain, a phe- 
nomenological theory in which the major elements of the complete auditory system are 
represented by simplified mathematical models, or a completely empirical theory in 
which each input is directly associated with an ultimate output response as determined 
experimentally. The first form of theory is likely to get bogged down by physical and 
mathematical complexities. The last form of theory (empirical) is likely to be imprac- 
tical because new response tests would have to be performed for each new waveform. It 
is not likely that such a theory would improve understanding of the hearing process. 
However, in cases where very rapid pressure changes are the overwhelming determinant 
of response (as in the case of sonic booms), the empirical approach may still prove com- 
pletely satisfactory for engineering calculations. Herein the phenomenological approach 
has been adopted with the hope that it will lead to reasonably accurate estimates of re- 
sponse to a variety of input signatures based on simple mathematical representations of 
the operational characteristics of the human auditory system. 

Figure 1. - Auditory system.

The complete auditory system consists of three principal elements: the ear, the
nervous system, and the brain (fig. 1). The primary operational functions of the ear,
nervous system, and brain are assumed to be, respectively, pressure amplification;
physical conversion, filtering and time differentiation; and autocorrelation and "psychophysical
conversion" (fig. 2). The "physical conversion" is from a sound wave to
mechanical pressure to a hydrodynamic wave and, finally, to an electrical wave. "Fil- 
tering" simply implies that all the energy of the incident sound waves is not transmitted 
to the brain. "Autocorrelation" concisely describes the mathematical process of inte- 
grating the "power" with respect to time. The term "psychophysical conversion" is in- 
tended to imply the conversion of a signal magnitude from objective, physical measure to 
subjective, psychological measure; that is, from physical intensity to loudness in the 
case of statistically stationary sounds. 

In the ear (refs. 3 and 9 to 11) (see fig. 1) the sound pressure fluctuations in the at- 
mosphere are mechanically amplified by the eardrum and ossicles into hydrodynamic 
pressure waves within the cochlea. The conversion into hydraulic waves occurs at the 
oval window. Within the cochlea the hydraulic pressure waves are further converted into 
electrical waves by hair cells in the organ of Corti. These waves are reproductions of 
the original sound pressure waveform (ref. 3, p. 109f). At the auditory nerve endings, 
the electrical waves are then encoded as electrical impulses of uniform amplitude which 
are transmitted to the brain through a bundle of nerve fibers comprising the auditory 
nerve. It appears from the uniform amplitude pulse code signals that the auditory nerve 
can be regarded as a lossless transmission line. The amplitude of the electrical wave 
must exceed a certain threshold value in order to produce an impulse in the auditory 
nerve. But most importantly the time rate of change of the electrical signal determines 
the number of nerve fibers along which the impulses will be transmitted (ref. 3, p. 112). 
The continued change of wave amplitude produces successive impulses in each nerve
fiber. The number of fibers which transmit impulses to the brain determines the loudness
of the original sound, as subjectively interpreted in the brain. It seems reasonable
to expect that the concept of "electric power" (output from the hair cells) can be associated
with one aspect of the information transmitted to the brain and that this power integrated
over a finite time duration is uniquely related to loudness. The well-known time
integration of the signal probably occurs in the brain. 

The preceding paragraphs outline the processes that will serve as the basis for a 
theory of loudness. Although the various mathematical operations will be associated 
with specific elements of the auditory system, possible incorrect associations (ref. 11) 
are not likely to affect the theory as long as the assumed operations do occur essentially 
in the order described. 

The theory will be developed according to the following procedure. The loudness is 
assumed to be uniquely related to the sound intensity integrated over a finite time, the 
auditory integration time. This average sound intensity is expressed as a function of the 
sound pressure history (time domain), or, alternatively, as a function of the sound pressure
spectrum (frequency domain). (The time and frequency domain analyses will be
presented consecutively, rather than in parallel.) Next, the operational characteristics
(pressure amplification, physical conversion, filtering, and time differentiation) of the 
auditory system are introduced. Information regarding the original sound intensity ulti- 
mately appears in the brain as information regarding the finite-time-average electric 
power reaching the auditory nerve. This power is expressed mathematically in both 
time- and frequency-domain representations. The power formula is made more explicit 
by specifying the filter characteristics of the auditory system in analytical form. Fi- 
nally, the psychoacoustic response called loudness is related to the power by applying 
Stevens' law (ref. 12). This completes the process of relating the sound pressure to
loudness through a chain of operations which presumably occur in the auditory system
and brain.

Figure 2. - Proposed model of auditory system. (Form of signal along the chain: sound
pressure, mechanical pressure, hydrodynamic pressure, electric current, electric impulses,
subjective loudness. In the time domain the input is p(t).)

The succession of auditory components, the operational processes they perform, 
and the corresponding physical and mathematical representations of the processes are 
diagramed in figure 2. 

PHYSICAL QUANTITIES RELATED TO LOUDNESS 

It is assumed that the subjective psychoacoustic quality called loudness is a single-valued
function of the finite-time-averaged intensity of the sound input at the ear.
S. Lifshitz (ref. 13) appears to have been the first to propose this relation. Its validity
is well established (refs. 1, 3, 12, 14, and 15). Specifically, the average acoustic in- 
tensity over all time (average energy flux over all time), 


$$ \bar{\Psi} = \overline{p v_n} \tag{1} $$

usually serves as a physical measure uniquely associated with loudness. In equation (1),
$p$ is the acoustic pressure, $v_n$ is the normal component of acoustic particle velocity
through a control surface having unit area, and the overbars denote infinite time averages.
(All symbols are defined in appendix A.) At distances from the sound source
which are large in comparison with the extent of the source, equation (1) is approximated
by the well-known plane-wave relation

$$ \bar{\Psi} = \frac{\overline{|p|^2}}{\rho c} \tag{2} $$

which in more detailed form is written as

$$ \bar{\Psi} = \lim_{T \to \infty} \frac{1}{2 \rho c T} \int_{-T}^{T} |p(t)|^2 \, dt \tag{3} $$

where $t$ is time, $\rho$ is the atmospheric density, and $c$ is the speed of sound. For stationary
sounds, equation (3) determines an adequate physical measure of loudness. However,
for momentary sounds, such as sonic booms, the intensity $\bar{\Psi}$ averaged over all
time is an unsatisfactory physical measure of loudness because $\bar{\Psi}$ may vanish. Even
for statistically stationary sounds, practical necessity requires that the averaging time
be finite. In audition a close approximation to $\bar{\Psi}$, both physically and psychologically,
is obtained for averaging times less than a second. Let $\tilde{\Psi}$ denote this average, where
the tilde indicates that the average is taken over a finite time duration $t_1$. Then the average
intensity

$$ \tilde{\Psi}(t) = \frac{1}{\rho c \, t_1} \int_{t - t_1}^{t} |p(\tau)|^2 \, d\tau \tag{4} $$

is a practical physical measure of loudness for continuous sounds, regardless of their
time dependence (i.e., statistics). For statistically stationary sounds, $\tilde{\Psi}$ is independent
of $t$.

Equation (4) as it stands cannot be correct, or at least complete. To illustrate this,
consider the following example. Suppose that throughout an arbitrary auditory integration
interval $t_1$, $p(t) = \text{constant} \neq 0$. Equation (4) indicates that the average intensity
would be nonvanishing and, hence, that auditory response occurs. In fact, auditory response
does not occur in this circumstance. Thus, equation (4) must be incomplete, or
incorrect. This difficulty will be eliminated when the operational characteristics of the
human auditory system are considered.
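
The difficulty can be seen numerically. The following is a minimal sketch, assuming uniformly sampled pressure and a rectangle-rule approximation of the finite-time average of $|p|^2/(\rho c)$ over the auditory integration time. The function name, the sampling step, the window length, and the values of the density and sound speed are illustrative assumptions, not quantities taken from this report.

```python
def finite_time_average_intensity(p, dt, t1, rho=1.2, c=343.0):
    """Approximate (1 / (rho * c * t1)) * integral of p(tau)**2 over the last t1 seconds.

    p   : list of pressure samples, most recent last (uniform spacing dt assumed)
    rho : air density in kg/m^3 (illustrative value, an assumption)
    c   : speed of sound in m/s (illustrative value, an assumption)
    """
    n = max(1, min(len(p), round(t1 / dt)))  # number of samples inside the window
    window = p[-n:]
    mean_square = sum(x * x for x in window) / n
    return mean_square / (rho * c)


# A constant (zero-frequency) pressure offset still yields a nonzero average
# intensity although nothing is heard: the incompleteness noted in the text.
steady_offset = [2.0] * 1000
intensity_of_offset = finite_time_average_intensity(steady_offset, dt=1e-3, t1=0.2)
```

Because the averaged quantity depends only on the squared pressure, any steady offset contributes to it, which is why the time differentiation introduced later is needed.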


PHYSICAL INPUT-OUTPUT RELATIONS 

Equation (4) is incomplete because it does not include operational characteristics of 
the auditory system; namely, the pressure amplification induced in the middle ear, the 
physical conversion from a pressure signal to an electrical signal by the hair cells in 
the inner ear, and the response to time rates of change of electrical current by the audi- 
tory nerve endings. By assuming that these processes are linear, they can be readily 
treated analytically. (Linearity was previously assumed in the loudness theory of Bürck,
Kotowski, and Lichte (ref. 5) and is indicated by measurements (ref. 3, p. 110).) 

The successive operations performed by the auditory system involve transfer func- 
tions which relate the input and output signal amplitudes. Because the auditory system 
does not constitute an all-pass filter, part of the energy of the input signal does not 
reach the brain. The complete auditory system acts essentially as a quasi-linear band-pass
filter. For humans the pass band extends roughly from 20 to 20 000 hertz. Any
linear filter is characterized by two alternative quantities; namely, the frequency response
function $H(\omega)$ and its Fourier transform, the impulse response function $h(t)$.
Specifically,

$$ H(\omega) = \int_{-\infty}^{\infty} h(t) e^{-i\omega t} \, dt \tag{5a} $$

$$ h(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H(\omega) e^{i\omega t} \, d\omega \tag{5b} $$

which may be denoted by

$$ h(t) \leftrightarrow H(\omega) \tag{6} $$

where $\omega$ is the angular frequency. The frequency response function $H(\omega)$ describes
the filter output for a sinusoidal input; the impulse response function $h(t)$ describes the
filter output for an impulsive input (delta function).


Time-Domain Analysis 

The hydrodynamic pressure fluctuations in the cochlea are assumed to be propor- 
tional to the input atmospheric pressure fluctuations. Next in succession, the electrical 
output of the hair cells in the inner ear is assumed to be proportional to the hydrody- 
namic pressure fluctuations. These proportionality constants can be lumped into a 
single constant k. Hence, for an arbitrary sound pressure input p(t) to this linear sys- 
tem, the resulting electric current output j(t) from the hair cells is simply given 
(ref. 16, p. 83) in the time domain by 


$$ j(t) = k \int_{-\infty}^{\infty} h(\tau) p(t - \tau) \, d\tau \tag{7a} $$

$$ j(t) = k \int_{-\infty}^{\infty} h(t - \tau) p(\tau) \, d\tau \tag{7b} $$


which are alternative expressions. The current j(t) reaches the auditory nerve endings. 
But the auditory nerve endings respond to the time rate of change of this current (ref. 3, 
pp. 112, 259), rather than to the current itself. Thus, the information transmitted to 
the brain along the auditory nerve concerns 

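$$ \frac{d}{dt} j(t) = k \int_{-\infty}^{\infty} h(\tau) \frac{\partial}{\partial t} p(t - \tau) \, d\tau \tag{8a} $$

$$ \frac{d}{dt} j(t) = k \int_{-\infty}^{\infty} \frac{\partial}{\partial t} [h(t - \tau)] p(\tau) \, d\tau \tag{8b} $$

rather than $j(t)$.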
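
The equivalence of the two forms in equations (8), and the way the differentiation removes the response to a constant pressure, can be checked with a discrete sketch. It assumes $k = 1$, a uniform sampling step, backward differences for the time derivative, and a hypothetical decaying-exponential impulse response; none of these choices comes from the report.

```python
import math


def convolve_causal(h, p, dt):
    """Discrete form of j(t) = k * integral h(tau) p(t - tau) dtau with k = 1.

    Both h and p are assumed to vanish before their first sample.
    """
    out = []
    for i in range(len(p)):
        upper = min(i, len(h) - 1)
        out.append(dt * sum(h[m] * p[i - m] for m in range(upper + 1)))
    return out


def backward_difference(x, dt):
    """Time derivative by backward differences, taking x = 0 before the first sample."""
    return [(x[i] - (x[i - 1] if i > 0 else 0.0)) / dt for i in range(len(x))]


dt = 1e-4
h = [math.exp(-200.0 * m * dt) for m in range(300)]   # hypothetical impulse response
tone = [math.sin(2.0 * math.pi * 50.0 * n * dt) for n in range(600)]

# Differentiating the filter output equals filtering the differentiated input.
lhs = backward_difference(convolve_causal(h, tone, dt), dt)
rhs = convolve_causal(h, backward_difference(tone, dt), dt)
max_mismatch = max(abs(a - b) for a, b in zip(lhs, rhs))

# For a constant pressure the derivative of the current dies out once the
# impulse response has been traversed, so a steady offset produces no drive.
steady = [1.0] * 600
drive = backward_difference(convolve_causal(h, steady, dt), dt)
late_drive = max(abs(v) for v in drive[len(h):])
```

Here `max_mismatch` is at the level of rounding error, and `late_drive` vanishes because the filtered constant no longer changes after the first 300 samples.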

At the outset it was emphasized that loudness is uniquely related to the intensity of
the sound input. After the hydrodynamic wave is converted into an electrical wave, electric
power $\Pi$, which is proportional to the acoustic intensity $\Psi$, replaces intensity as
the appropriate physical measure of loudness. The output of the hair cells is a reproduction
of the original sound waveform. Thus, in parallelism with equation (4), the
finite-time-average electric power output of the hair cells $\tilde{\Pi}_c(t)$ is given by

$$ \tilde{\Pi}_c(t) = \frac{R}{t_1} \int_{t - t_1}^{t} |j(\tau)|^2 \, d\tau \tag{9} $$

where an electrical resistance $R$ has been introduced to give the equation dimensions of
power. When the response characteristics of the auditory nerve endings are included,
equation (9) must be replaced by

$$ \tilde{\Pi}(t) = \frac{R}{t_1} \int_{t - t_1}^{t} \left| \frac{d}{d\tau} j(\tau) \right|^2 d\tau \tag{10} $$

where $\frac{d}{d\tau} j(\tau)$ is given by equations (8). Information regarding this power is transmitted
to the brain along the auditory nerve. The mode of transmission is essentially
lossless and need not be specified in the present theory.

In summary, equations (8) and (10) relate the input sound pressure to the electric 
power reaching the auditory nerve. Information regarding this power is transmitted to 
the brain, wherein the information is interpreted as possessing the subjective quality 
called loudness. 
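
As a rough numerical illustration of the finite-time-average power in equation (10), the sketch below averages the squared time derivative of a sampled current over the most recent integration window. The function names, the value $R = 1$, the sampling step, and the window length of 0.2 second are assumptions made only for this example.

```python
import math


def finite_time_average_power(j, dt, t1, R=1.0):
    """Approximate (R / t1) * integral of (dj/dtau)**2 over the last t1 seconds."""
    n = round(t1 / dt)                       # samples in the integration window
    window = j[-(n + 1):]                    # n + 1 samples give n differences
    rates = [(window[i + 1] - window[i]) / dt for i in range(n)]
    return R * sum(r * r for r in rates) * dt / t1


def tone(frequency_hz, dt, duration):
    """Unit-amplitude sinusoidal current sampled every dt seconds (hypothetical input)."""
    steps = round(duration / dt)
    return [math.sin(2.0 * math.pi * frequency_hz * n * dt) for n in range(steps + 1)]


dt, t1 = 1e-5, 0.2                           # t1 = 0.2 s is an illustrative choice
power_100 = finite_time_average_power(tone(100.0, dt, 0.3), dt, t1)
power_200 = finite_time_average_power(tone(200.0, dt, 0.3), dt, t1)
```

For a unit-amplitude tone the result is close to $R\omega^2/2$, so doubling the frequency roughly quadruples the power. This is the time-domain counterpart of the high-frequency weighting that appears in the frequency-domain analysis.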

The preceding theory is specified in the time domain. The corresponding results 
will now be derived for the frequency domain. 


Frequency-Domain Analysis 

In the frequency-domain representation, the electric power is to be expressed as a
function of the sound pressure spectrum $P(\omega)$, which is the Fourier transform of the
sound pressure history, or signature $p(t)$. In other words,

$$ P(\omega) \leftrightarrow p(t) \tag{11} $$

Because the auditory integration period $t_1$ is finite, the nature of the sound pressure
spectrum after passage of a finite time may sometimes be of interest. This so-called
"running" spectrum $P(\omega, t)$ is given by (ref. 16, p. 148f)

$$ P(\omega, t) = \int_{-\infty}^{t} p(\tau) e^{-i\omega\tau} \, d\tau \tag{12a} $$

or

$$ P(\omega, t) \leftrightarrow \theta(t - \tau) p(\tau) \tag{12b} $$

where

$$ \theta(t - \tau) = \begin{cases} 1 & (\tau < t) \\ 0 & (\tau > t) \end{cases} \tag{13} $$

Also,

$$ |\theta(t - \tau)|^2 = \theta(t - \tau) \tag{14} $$

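The running pressure spectrum $P(\omega, t)$ is a function of $P(\omega)$ (ref. 16, p. 149). Specifically,

$$ P(\omega, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \pi \delta(y - \omega) - i (y - \omega)^{-1} \right] P(y) e^{i(y - \omega)t} \, dy \tag{15} $$

relates the two spectra, where $\delta$ is the unit impulse function. In particular,

$$ P(\omega, t = \infty) \equiv P(\omega) \tag{16} $$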
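
The behavior of the running spectrum can be sketched numerically. The example assumes a hypothetical rectangular pressure pulse of 10 milliseconds starting at $\tau = 0$ and a rectangle-rule approximation of the integral of $p(\tau) e^{-i\omega\tau}$ up to time $t$. The function name and all numerical values are illustrative assumptions.

```python
import cmath


def running_spectrum(p, dt, omega, t):
    """Rectangle-rule estimate of the integral of p(tau) exp(-i omega tau) from 0 to t.

    p is assumed to vanish before its first sample (tau = 0).
    """
    last = min(len(p), round(t / dt))
    return dt * sum(p[n] * cmath.exp(-1j * omega * n * dt) for n in range(last))


dt = 1e-5
pulse_length = 0.01                                   # hypothetical 10 ms rectangular pulse
p = [1.0 if n * dt < pulse_length else 0.0 for n in range(3000)]
omega = 2.0 * cmath.pi * 50.0

during = running_spectrum(p, dt, omega, 0.005)        # pulse still in progress
after = running_spectrum(p, dt, omega, 0.02)          # pulse has ended
later = running_spectrum(p, dt, omega, 0.03)          # no further change
exact = (1.0 - cmath.exp(-1j * omega * pulse_length)) / (1j * omega)
```

While the pulse is in progress the running spectrum keeps changing. Once the pulse has ended it stays fixed at the ordinary spectrum of the pulse, in line with equation (16).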

Similar relations apply for the current. Thus, the current possesses a spectrum $J(\omega)$
given by

$$ J(\omega) \leftrightarrow j(t) \tag{17} $$

as well as a running spectrum,

$$ J(\omega, t) \leftrightarrow \theta(t - \tau) j(\tau) \tag{18} $$

Equations (15) and (16), with $P$ replaced by $J$, also apply.

In the time-domain representation, the current is related to the pressure according 
to equation (7). The corresponding frequency-domain representation is (ref. 16, p. 86) 

$$ J(\omega) = k H(\omega) P(\omega) \tag{19} $$

As shown in appendix B, the running current spectrum may be expressed in terms of the
response functions and pressure input by

$$ J(\omega, t) = k \int_{-\infty}^{t} H(\omega, t - \tau) p(\tau) e^{-i\omega\tau} \, d\tau \tag{20a} $$

$$ J(\omega, t) = k \int_{-\infty}^{\infty} h(\tau) P(\omega, t - \tau) e^{-i\omega\tau} \, d\tau \tag{20b} $$

where

$$ H(\omega, t) \leftrightarrow \theta(t - \tau) h(\tau) \tag{21} $$

and $H(\omega, t)$ is the running frequency response function. In parallel with equation (16),
$H(\omega, t = \infty) = H(\omega)$. More generally, equation (15) applies with $P$ replaced by $H$.

It is noteworthy that the integrands in equations (20) consist of products of time and
frequency functions. If $t \to \infty$, and $h(\tau)$ and $p(\tau) \to 0$ as $\tau \to \infty$, equations (20) reduce
to equation (19). Even more importantly from the standpoint of steady noise, if
$H(\omega, t - \tau)$ is effectively constant throughout the interval $(-\infty, t)$, except for a short time
$t - \tau \ll t_1$ (where, as before, $t_1$ is the auditory integration period), equations (20) reduce
to

$$ J(\omega, t) \approx k H(\omega) P(\omega, t) \tag{22} $$

which involves the running spectrum $P(\omega, t)$, rather than the ordinary spectrum $P(\omega)$
contained in equation (19).

In determining the electric power, the time derivative of the current, rather than 
the current itself, is most significant. The frequency- domain representation of this 
derivative is given by the Fourier transform 

$$ \frac{d}{dt} j(t) \leftrightarrow i\omega J(\omega) \tag{23} $$

However, the loudness is determined by the electric power integrated over a finite time
interval $t_1$ (cf. eq. (10)). Hence, the running transform of the time derivative applies.
The running transform is given by

$$ \frac{d}{d\tau} [\theta(t - \tau) j(\tau)] \leftrightarrow i\omega J(\omega, t) \tag{24} $$

In expanded form, namely

$$ \frac{d}{d\tau} [\theta(t - \tau) j(\tau)] = \theta(t - \tau) \frac{d}{d\tau} j(\tau) - \delta(t - \tau) j(\tau) \tag{25} $$

the left-hand side of equation (24) is seen to include an impulsive transient associated
with switching off the integration. The existence of this transient is independent of the
specific time dependence of $j(t)$. Hence, the transient is not determined by the input.
The transient is not associated with the human auditory system because the human mind
detects no subjective loudness of an impulsive nature associated with initiation or termination
of the auditory integration process. The auditory system remains in the "on"
state all the time. Thus, this "switching" transient is unphysical and should be omitted
in computing the power. The switching transient is purely a consequence of the mathematics
of the Fourier transform and does not appear if $\frac{d}{dt} j(t)$ is represented by a
Fourier series expansion over the period $t_1$. Finally, equation (10), the original time-domain
equation for the power, does not contain switching transients.

With the switching transients eliminated it is shown in appendix C that, in the 
frequency-domain representation, the electric power is given by 

$$ \tilde{\Pi}(t) = \frac{R}{2\pi t_1} \int_{-\infty}^{\infty} \left[ |J(\omega, t)|^2 - |J(\omega, t - t_1)|^2 \right] \omega^2 \, d\omega \tag{26} $$

which, when accompanied by equations (20), expresses the average power as a function
of the input pressure history or its spectrum. Equation (26) exhibits a high-frequency
weighting ($\propto \omega^2$) of the average power by virtue of the nerve endings' response to the
time derivative of the current. Thus, the auditory nerve endings act as high-pass
filters.
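
The origin of the $\omega^2$ weighting can be checked with a short sketch. It deliberately ignores the finite integration window and the switching transients: it only compares, for a whole signal, the time integral of the squared derivative with $(1/2\pi)$ times the integral of $\omega^2 |J(\omega)|^2$. The Gaussian current, the grid, and the direct discrete Fourier transform are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(N^2); adequate for a short illustration)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


n, span = 256, 16.0
dt = span / n
times = [-span / 2 + m * dt for m in range(n)]
current = [math.exp(-0.5 * t * t) for t in times]      # hypothetical Gaussian j(t)

# Time domain: integral of (dj/dt)^2, using the analytic derivative -t * j(t).
time_side = dt * sum((t * j) ** 2 for t, j in zip(times, current))

# Frequency domain: (1 / 2 pi) * integral of omega^2 |J(omega)|^2.
spectrum = [dt * c for c in dft(current)]
d_omega = 2.0 * math.pi / span
omegas = [d_omega * (k if k < n // 2 else k - n) for k in range(n)]
freq_side = d_omega / (2.0 * math.pi) * sum(
    w * w * abs(c) ** 2 for w, c in zip(omegas, spectrum))
```

Both sides agree with the closed-form value $\sqrt{\pi}/2$ for this Gaussian, which shows that differentiating in time is equivalent to weighting the energy spectrum by $\omega^2$.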

The integral in equation (26) may be divergent. To avoid this difficulty, $\frac{d}{dt} j(t)$ in
equation (10) may be expanded as a Fourier series over the integration interval. The
power is ultimately expressed by the sum of the squares of the Fourier coefficients.
This approach is especially useful when $\frac{d}{dt} j(t)$ is periodic. The case where $p(t)$ is a
pure tone is considered in appendix D. If $p(t)$ is unknown and information is available
regarding the spectrum, but the integral in equation (26) is divergent, then methods described
by Bennett (refs. 17 and 18) may be used to evaluate the power.

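The resulting power formulas may now be summarized. In the time-domain representation,

$$ \tilde{\Pi}(t) = \frac{k^2 R}{t_1} \int_{t - t_1}^{t} \left| \int_{-\infty}^{\infty} h(\tau') \frac{\partial}{\partial \tau} p(\tau - \tau') \, d\tau' \right|^2 d\tau \tag{27a} $$

$$ \tilde{\Pi}(t) = \frac{k^2 R}{t_1} \int_{t - t_1}^{t} \left| \int_{-\infty}^{\infty} \frac{\partial}{\partial \tau} [h(\tau - \tau')] p(\tau') \, d\tau' \right|^2 d\tau \tag{27b} $$

If $h(t)$ and $p(t)$ are both real, the absolute value signs in equations (27) can, of course,
be removed. In the frequency-domain representation,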


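$$ \tilde{\Pi}(t) = \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} \left[ \left| \int_{-\infty}^{t} H(\omega, t - \tau) p(\tau) e^{-i\omega\tau} \, d\tau \right|^2 - \left| \int_{-\infty}^{t - t_1} H(\omega, t - t_1 - \tau) p(\tau) e^{-i\omega\tau} \, d\tau \right|^2 \right] \omega^2 \, d\omega \tag{28a} $$

$$ \tilde{\Pi}(t) = \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} \left[ \left| \int_{-\infty}^{\infty} h(\tau) P(\omega, t - \tau) e^{-i\omega\tau} \, d\tau \right|^2 - \left| \int_{-\infty}^{\infty} h(\tau) P(\omega, t - t_1 - \tau) e^{-i\omega\tau} \, d\tau \right|^2 \right] \omega^2 \, d\omega \tag{28b} $$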

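If the sound input is steady noise and $H(\omega, t - \tau)$ is effectively constant, except
during the short time interval $t - \tau \ll t_1$, then equation (22) is valid. Therefore,

$$ \tilde{\Pi}(t) \approx \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} \left[ |P(\omega, t)|^2 - |P(\omega, t - t_1)|^2 \right] |H(\omega)|^2 \omega^2 \, d\omega \tag{29a} $$

or

$$ \tilde{\Pi}(t) \approx \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} \left[ |P(\omega, t)|^2 - |P(\omega, t - t_1)|^2 \right] |H_c(\omega)|^2 \, d\omega \tag{29b} $$

follows from equation (28a), where, by definition,

$$ H_c(\omega) = i\omega H(\omega) \tag{30} $$

and

$$ h_c(t) = \frac{d}{dt} h(t) \tag{31} $$

$$ h_c(t) \leftrightarrow H_c(\omega) \tag{32} $$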


If $h(t)$ and $p(t)$ are quasi-impulsive, that is, if $h(t)$ and $p(t)$ are nonvanishing only
over a time period shorter than the integration time $t_1$, then equations (28) simplify.
Suppose that $p(t)$ is initiated at some time $t_0$ and is quasi-impulsive. Assume that $t_0$
occurs within the auditory integration interval, that is, $t - t_1 < t_0 < t$. Then, the second
time integral in equations (28) vanishes. Assume that $p(t)$ effectively vanishes permanently
again at some later time $t_0'$ less than the upper limit of integration, that is,
$t_0 < t_0' < t$. Then, if the running frequency response $H(\omega, t)$ effectively becomes independent
of time for time durations $t - t_0'$ or larger, that is, if $H(\omega, t - t_0') \approx H(\omega)$, then
$H(\omega, t - \tau)$ may be extracted from the integrand of the first integral in equation (28a).

The condition $H(\omega, t - t_0') \approx H(\omega)$ corresponds to $h(\tau \geq t - t_0') \approx 0$, and in conjunction
with causality (to be discussed) implies that $h(t)$ is quasi-impulsive. The remaining
integral equals $P(\omega, t)$. But, because $p(t)$ effectively vanishes for $t > t_0'$, it follows
that $P(\omega, t) \approx P(\omega)$. Hence, equation (28a) reduces to

$$ \tilde{\Pi} \approx \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} |H(\omega) P(\omega)|^2 \omega^2 \, d\omega \tag{33a} $$

or

$$ \tilde{\Pi} \approx \frac{k^2 R}{2\pi t_1} \int_{-\infty}^{\infty} |H_c(\omega) P(\omega)|^2 \, d\omega \tag{33b} $$

Equation (28b) can be reduced to equations (33) by a similar argument.

In general, the response characteristics of the complete auditory system are mathematically
entangled with the input pressure. However, in the special case of equations
(29) and (33), the response characteristics are separable from the input. Thus, the
quantities $h_c(t)$ and $H_c(\omega)$ represent the response functions for the complete auditory
system preceding the auditory nerve.


CAUSALITY AND THE PALEY-WIENER THEOREM

In the preceding section the electrical power output was related to the sound pres- 
sure input by introducing response characteristics of the auditory system. The next 
step is to specify the response characteristics explicitly. However, before doing so it 
is important to consider the consequences of causality and the Paley-Wiener theorem in
this regard. The Paley-Wiener theorem not only has a bearing on the response characteristics
but also provides an important result regarding the spectra of short duration
sounds.

First, consider causality. Cause precedes effect. Thus, if $p(t)$ is initiated at
some time $t = t_0$, then $j(t) = 0$ for $t < t_0$. Without loss of generality, one can set
$t_0 = 0$.

A function which vanishes for $t < 0$ is called causal (ref. 16, p. 13). The impulse
response $h(t)$ is causal (ref. 16, p. 85). Hence, $H(\omega, t)$ is also causal. It follows from
equations (7) and (20), respectively, that $j(t)$ and $J(\omega, t)$ must be causal.

Causality in conjunction with the Paley-Wiener theorem (ref. 19, p. 16ff or ref. 16,
pp. 215-217 and 222) leads to important consequences regarding the auditory response
characteristics as well as the spectra of finite duration sounds. The Paley-Wiener theorem
states that, if

$$ \int_{-\infty}^{\infty} \frac{\left| \ln |F(\omega)| \right|}{1 + \omega^2} \, d\omega < \infty \tag{34} $$

where $F(\omega)$ is square integrable, that is,

$$ \int_{-\infty}^{\infty} |F(\omega)|^2 \, d\omega < \infty \tag{35} $$

then $f(t)$, which is the Fourier transform of $F(\omega)$, is causal. Note that, if $F(\omega)$ vanishes
over any nonvanishing interval $\Delta\omega$, the inequality (eq. (34)) is violated. Therefore,
the spectrum of any causal function must be nonvanishing at all, except possibly
discrete, frequencies if the spectrum satisfies the inequality (eq. (35)).

Let the auditory impulse response $h(t)$ correspond to $f(t)$ and the frequency response
$H(\omega)$ correspond to $F(\omega)$. From the Paley-Wiener theorem it follows that, because
$h(t)$ is causal, the auditory frequency-response function $H(\omega)$ must be nonvanishing
for all, except possibly discrete, frequencies. This means that the auditory response
cannot be represented by a simple ideal band-pass filter (ref. 20).
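
The acausality of an ideal band-pass filter can be illustrated directly. The sketch assumes a zero-phase ideal filter with unit gain between the nominal audible band edges of 20 and 20 000 hertz; the zero-phase choice and the function name are assumptions made for this example. Its impulse response, $[\sin(\omega_2 t) - \sin(\omega_1 t)]/(\pi t)$, is an even function of time and therefore does not vanish before the impulse is applied.

```python
import math


def ideal_bandpass_impulse_response(t, omega_low, omega_high):
    """Impulse response of a zero-phase ideal band-pass filter with unit gain
    for omega_low < |omega| < omega_high and zero gain elsewhere."""
    if t == 0.0:
        return (omega_high - omega_low) / math.pi
    return (math.sin(omega_high * t) - math.sin(omega_low * t)) / (math.pi * t)


omega_low = 2.0 * math.pi * 20.0        # nominal 20 Hz lower band edge
omega_high = 2.0 * math.pi * 20000.0    # nominal 20 000 Hz upper band edge

# The response 0.13 ms before the impulse equals the response 0.13 ms after it.
before_impulse = ideal_bandpass_impulse_response(-1.3e-4, omega_low, omega_high)
after_impulse = ideal_bandpass_impulse_response(+1.3e-4, omega_low, omega_high)
```

Adding a linear phase delay would only shift this response in time. It would still extend indefinitely toward negative times, so the conclusion does not depend on the zero-phase assumption.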

Let the electric current $j(t)$ correspond to $f(t)$ and the current spectrum $J(\omega)$ correspond
to $F(\omega)$. Then, because $j(t)$ is causal and $J(\omega)$ is square integrable for finite
signal amplitudes, the electrical output of the hair cells associated with any sound input
must include all, except possibly discrete, frequencies in the audible range.

Finally, let the sound pressure $p(t)$ correspond to $f(t)$ and the pressure spectrum
$P(\omega)$ correspond to $F(\omega)$. If $p(t)$ is causal and $P(\omega)$ is square integrable, $P(\omega)$ must
be nonvanishing for all, except possibly discrete, frequencies. The sonic boom represents
an important sound satisfying the Paley-Wiener conditions. Hence, the production
of an inaudible sonic boom is impossible. This conclusion is reached without even taking
account of the auditory system.

In the preceding applications of the Paley-Wiener theorem the results are valid in
principle. However, the amplitudes of the response functions or signals have not been
considered. By definition, the response functions vanish outside the audible frequency
range. Inside the audible range the signals may be too weak to produce any response.


EXPLICIT FILTER CHARACTERISTICS OF THE AUDITORY SYSTEM 

The filter characteristics of the auditory system can now be specified explicitly. In
equations (30) and (31) the response characteristics of the complete auditory system,
namely $h_c(t)$ and $H_c(\omega)$, have been specified as functions of $h(t)$ and $H(\omega)$. Because
the auditory system is assumed to perform linear filtering, it represents a stable system
in the sense that its response to any bounded input is bounded. This implies that
$h_c(t)$ is absolutely integrable, that is,

$$ \int_{-\infty}^{\infty} |h_c(t)| \, dt < \infty \tag{36} $$

or $h_c(t) \to 0$ faster than $1/t$ as $t \to \infty$. If $h_c(t)$ is absolutely integrable, $H_c(\omega) \to 0$ as
$|\omega| \to \infty$ by virtue of the Riemann-Lebesgue lemma (ref. 21), namely

$$ \lim_{|\omega| \to \infty} \int_{-\infty}^{\infty} h_c(t) e^{-i\omega t} \, dt = \lim_{|\omega| \to \infty} H_c(\omega) = 0 \tag{37} $$

Because the human sensory system is excited only by time-dependent inputs, it follows
that $H_c(0) = 0$. The quantity $H_c(\omega)$ must, therefore, describe a band-pass filter. The
human auditory system as a passive filter is causal. The causality condition in conjunction
with the Paley-Wiener theorem indicates that the auditory response cannot be characterized
by an ideal band-pass filter because the impulse response of such a filter is
acausal. The impulse response at time t = 0 is given by

$$h_c(0) = \frac{2}{\pi} \int_0^{\infty} \mathrm{Re}\,H_c(\omega)\,d\omega \tag{38}$$

which follows from equation (32) and causality and where Re denotes the real part.
Because the auditory response is nonvanishing in the audible frequency pass-band, it
follows that $|\mathrm{Re}\,H_c(\omega)| > 0$ for $0 < \omega < \infty$ because $\mathrm{Re}\,H_c(\omega)$ cannot change sign.
Hence,

$$|h_c(0)| > 0 \tag{39}$$

To summarize: In attempting to relate its time- and frequency-response characteristics,
the complete auditory system as a linear system corresponds to a nonideal band-pass
filter. The associated response to an infinite impulse is necessarily nonvanishing at the
instant the impulse is applied, but vanishes faster than $1/t$ as $t \to \infty$.

The preceding arguments apply with respect to the running frequency response
$H_c(\omega, t)$, as well as with respect to $H_c(\omega)$. Thus, for example, in terms of $H_c(\omega, t)$,

$$\theta(t - \tau)\,h_c(\tau) \leftrightarrow H_c(\omega, t) \tag{40}$$

leads to

$$h_c(\tau) \leftrightarrow H_c(\omega, \infty) \tag{41}$$

Hence,

$$h_c(t) \leftrightarrow 2\,\mathrm{Re}\,H_c(\omega, \infty) \equiv 2\,\mathrm{Re}\,H_c(\omega) \tag{42}$$

and equation (38) follows by setting t = 0.
Response functions of the type described above are commonly associated with multistage
amplifiers (ref. 22) wherein the frequency-response function is represented by the
ratio of polynomial functions of frequency so that the impulse response is, then,
necessarily described by decaying exponential functions of time. An appropriate Fourier
transform pair is


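$$H_c(\omega) = iA\omega\,(\omega_1 + i\omega)^{-1}(\omega_2 + i\omega)^{-1} \tag{43}$$

in the frequency domain, which corresponds to (ref. 23)

$$h_c(t) = \theta(t)\,A\,(\omega_2 - \omega_1)^{-1}\left(\omega_2 e^{-\omega_2 t} - \omega_1 e^{-\omega_1 t}\right) \tag{44}$$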


in the time domain, where $A$ is a constant and $\omega_1$ and $\omega_2$ are the pass-band cutoff
frequencies. These cutoff frequencies are properties of the auditory system. Predicting
the values of these frequencies requires a more detailed physical analysis of the
auditory system than that provided herein. Moreover, the cutoff frequencies, by definition,
are not directly measurable but can only be estimated from auditory transmittance
curves. From equations (30) and (43) it follows that


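$$H(\omega) = A\,(\omega_1 + i\omega)^{-1}(\omega_2 + i\omega)^{-1} \tag{45}$$

which is the transform of

$$h(t) = \theta(t)\,A\,(\omega_2 - \omega_1)^{-1}\left(e^{-\omega_1 t} - e^{-\omega_2 t}\right) \tag{46}$$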


When $\omega_1$ and $\omega_2$ are evaluated, it is found, in fact, that the impulse response $h(t)$ is
quasi-impulsive. Equation (46) is identical to the impulse response formula for nerves,
as derived from physical arguments by Zwislocki in his "Theory of Temporal Auditory
Summation" and confirmed by prior experiments of Galambos on medullary nerves
(ref. 7). Equations (45) and (46) describe a low-pass filter. This in conjunction with the
high-pass weighting - due to the auditory nerve endings' response to pressure rates of
change, rather than pressures - in equation (26) causes the complete auditory system to
perform band-pass filtering, as indicated by equations (43) and (44).
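
The band-pass behavior described by equations (43) to (46) can be checked numerically. The following Python sketch is an illustration only and is not part of the original analysis: the function names, the unit gain constant, and the cutoff frequencies of 200 and 5000 hertz are assumptions chosen for demonstration, not values taken from the report.

```python
import math

# Hypothetical cutoff frequencies in rad/s (illustrative values only).
W1 = 2.0 * math.pi * 200.0
W2 = 2.0 * math.pi * 5000.0
A = 1.0  # arbitrary gain constant


def h_lowpass(omega, a=A, w1=W1, w2=W2):
    """Low-pass response H(w) in the form of eq. (45)."""
    return a / ((w1 + 1j * omega) * (w2 + 1j * omega))


def h_complete(omega, a=A, w1=W1, w2=W2):
    """Band-pass response H_c(w) = i*w*H(w) in the form of eq. (43)."""
    return 1j * omega * h_lowpass(omega, a, w1, w2)


def h_c_impulse(t, a=A, w1=W1, w2=W2):
    """Impulse response h_c(t) in the form of eq. (44); zero for t < 0."""
    if t < 0.0:
        return 0.0
    return a * (w2 * math.exp(-w2 * t) - w1 * math.exp(-w1 * t)) / (w2 - w1)


if __name__ == "__main__":
    w_peak = math.sqrt(W1 * W2)  # where |H_c| of this form is largest
    for omega in (0.0, 0.1 * W1, W1, w_peak, W2, 10.0 * W2):
        print(f"f = {omega / (2.0 * math.pi):10.1f} Hz   |H_c| = {abs(h_complete(omega)):.3e}")
    print("h_c(0) =", h_c_impulse(0.0))
```

For this functional form the magnitude vanishes at zero frequency, rises to a maximum of $A/(\omega_1 + \omega_2)$ at $\omega = (\omega_1\omega_2)^{1/2}$, and falls off again at high frequency, while $h_c(0) = A$ is nonvanishing, in keeping with inequality (39).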

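The explicit equation (46) for $h(t)$ may now be introduced in equations (27) to provide
the final time-domain formulation for the electric power. Thus,

$$\Pi(t) = \kappa^2 A^2 (\omega_2 - \omega_1)^{-2} R t_1 \int_{t - t_1}^{t} \left[ \int_0^{\infty} \left( e^{-\omega_1 \xi} - e^{-\omega_2 \xi} \right) \frac{\partial}{\partial \tau}\,p(\tau - \xi)\,d\xi \right]^2 d\tau \tag{47a}$$

$$\Pi(t) = \kappa^2 A^2 (\omega_2 - \omega_1)^{-2} R t_1 \int_{t - t_1}^{t} \left[ \int_{-\infty}^{\tau} \left( \omega_2 e^{-\omega_2 (\tau - \tau')} - \omega_1 e^{-\omega_1 (\tau - \tau')} \right) p(\tau')\,d\tau' \right]^2 d\tau \tag{47b}$$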


Only the sound pressure history and values of $\omega_1$ and $\omega_2$ need to be specified in order
to complete the calculation of $\Pi$.

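To complete the frequency-domain formulation for $\Pi$, the running response function
$H(\omega, t)$ must be written explicitly for potential introduction in equation (28a). By virtue
of equations (21) and (46),

$$H(\omega, t) = \frac{A\,\theta(t)}{\omega_2 - \omega_1} \left[ \frac{\omega_2 - \omega_1}{(\omega_2 + i\omega)(\omega_1 + i\omega)} - \frac{e^{-(\omega_1 + i\omega)t}}{\omega_1 + i\omega} + \frac{e^{-(\omega_2 + i\omega)t}}{\omega_2 + i\omega} \right] \tag{48}$$

However,

$$\omega_1 + i\omega = \omega_1 \left( 1 + \frac{i\omega}{\omega_1} \right) = \omega_1 \left[ 1 + \left( \frac{\omega}{\omega_1} \right)^2 \right]^{1/2} e^{i\Theta_1} \tag{49}$$

where

$$\Theta_1 = \tan^{-1} \frac{\omega}{\omega_1} \tag{50}$$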

and similarly for $\omega_2 + i\omega$. Then, $H(\omega, t)$ may be written in a form in which its real and
imaginary parts are easily recognized, namely

By comparing the time-dependent terms in equation (51) with the time-independent term
it can be shown that the time-dependent terms are generally significant over a time less
than 10 percent of the integration period $t_1$. Hence, when considering steady noise, the
effect of the time-dependent terms in $H(\omega, t)$ can be neglected, so that equations (29) are
valid.
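
The decay of the time-dependent terms can be illustrated numerically. The Python sketch below is an illustration under stated assumptions, not the author's computation: it evaluates the closed form of the running response (the steady term plus two decaying exponential terms, as in equation (48)), checks it against a direct trapezoidal integration of $h(\tau)e^{-i\omega\tau}$ from 0 to $t$, and compares it with the steady response of equation (45). The cutoff frequencies (200 and 5000 hertz), the unit gain, the 1000-hertz test frequency, and all function names are hypothetical.

```python
import cmath
import math

# Hypothetical constants (illustrative only).
A = 1.0
W1 = 2.0 * math.pi * 200.0   # lower cutoff, rad/s
W2 = 2.0 * math.pi * 5000.0  # upper cutoff, rad/s


def steady_response(omega, a=A, w1=W1, w2=W2):
    """Steady low-pass response H(w), eq. (45)."""
    return a / ((w1 + 1j * omega) * (w2 + 1j * omega))


def running_response(omega, t, a=A, w1=W1, w2=W2):
    """Closed-form running response: steady term plus decaying transients."""
    if t <= 0.0:
        return 0j
    s1 = w1 + 1j * omega
    s2 = w2 + 1j * omega
    steady = (w2 - w1) / (s1 * s2)
    transient = -cmath.exp(-s1 * t) / s1 + cmath.exp(-s2 * t) / s2
    return a / (w2 - w1) * (steady + transient)


def running_response_numeric(omega, t, a=A, w1=W1, w2=W2, steps=20000):
    """Trapezoidal integral of h(tau)*exp(-i*omega*tau) over 0..t, with h from eq. (46)."""
    dt = t / steps
    total = 0j
    for n in range(steps + 1):
        tau = n * dt
        h = a * (math.exp(-w1 * tau) - math.exp(-w2 * tau)) / (w2 - w1)
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * h * cmath.exp(-1j * omega * tau)
    return total * dt


if __name__ == "__main__":
    omega = 2.0 * math.pi * 1000.0
    t1 = 0.2  # integration period, seconds
    for frac in (0.001, 0.01, 0.1):
        t = frac * t1
        rel = abs(running_response(omega, t) - steady_response(omega)) / abs(steady_response(omega))
        print(f"t = {frac:5.3f} t1   relative transient = {rel:.2e}")
```

With these assumed cutoff frequencies the transient contribution is already negligible well before one-tenth of the integration period has elapsed.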

More generally, a frequency-domain formulation for $\Pi$ is obtained by substituting
the preceding expression for $H(\omega, t)$ in equation (28a). The result is lengthy and will not
be written down. A simpler-looking alternative is obtained by substituting the expression
for $h(t)$ given by equation (46) into equation (28b). The result is


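$$\Pi(t) = \frac{R t_1}{2\pi} (\kappa A)^2 (\omega_2 - \omega_1)^{-2} \int_{-\infty}^{\infty} \omega^2 \left\{ \left| \int_0^{\infty} \left( e^{-\omega_1 \tau} - e^{-\omega_2 \tau} \right) P(\omega, t - \tau)\,e^{-i\omega\tau}\,d\tau \right|^2 - \left| \int_0^{\infty} \left( e^{-\omega_1 \tau} - e^{-\omega_2 \tau} \right) P(\omega, t - t_1 - \tau)\,e^{-i\omega\tau}\,d\tau \right|^2 \right\} d\omega \tag{52}$$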


In the special case where $h(t)$ and $p(t)$ are quasi-impulsive, equations (33b) and (43)
apply. Then,

$$\Pi = \cdots \int \cdots \frac{\omega^2\,d\omega}{\left[ 1 + (\omega/\omega_1)^2 \right] \left[ 1 + (\omega/\omega_2)^2 \right]} \tag{53}$$
PSYCHOACOUSTICAL RELATIONS

Expressions have been obtained for the electric power information presumably received
by the auditory nerve as a function of the input pressure signature or spectrum
and the physical characteristics of the auditory system. In the brain the information
regarding this objective physical power is converted into subjective responses, such as
loudness and annoyance, as well as into other objective and subjective responses, some
of which may be classified as startle responses. Only loudness will be evaluated herein,
but the other subjective responses may be evaluated in a similar fashion (ref. 24).

S. S. Stevens has shown (refs. 24 and 25) for a wide variety of psychophysical phenomena
that, if $\varphi$ is the magnitude of a physical stimulus and $\psi$ is a psychological
magnitude (determined by subjective judgments), then

$$\psi = k\varphi^m \tag{54}$$

where $k$ and $m$ are constants dependent on the phenomenon and often on the individual
as well. Equation (54) may be aptly called Stevens' law. It supplants the well-known
Fechner law,

$$\psi = k_1 \ln \varphi \tag{55}$$

which is experimentally invalid (ref. 26).
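
The practical difference between the two laws is how the response changes when the stimulus is scaled. The short Python sketch below is illustrative only; the function names and the numerical values of the constants are assumptions, not fitted values from the references. Under the power law a doubling of the stimulus multiplies the response by $2^m$ at every stimulus level, whereas under the logarithmic law it adds the fixed increment $k_1 \ln 2$.

```python
import math


def stevens_response(phi, k, m):
    """Power law, eq. (54): psi = k * phi**m."""
    return k * phi ** m


def fechner_response(phi, k1):
    """Logarithmic law, eq. (55): psi = k1 * ln(phi)."""
    return k1 * math.log(phi)


if __name__ == "__main__":
    k, m, k1 = 1.0, 0.3, 1.0  # hypothetical constants for illustration
    for phi in (1.0, 10.0, 100.0):
        ratio = stevens_response(2.0 * phi, k, m) / stevens_response(phi, k, m)
        step = fechner_response(2.0 * phi, k1) - fechner_response(phi, k1)
        print(f"phi = {phi:6.1f}   power-law ratio = {ratio:.4f}   log-law step = {step:.4f}")
```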

If Stevens' law is applied to loudness (ref. 12), then

$$\mathcal{L} = k_0 \Pi^{l} \tag{56}$$

where $\mathcal{L}$ is the loudness (sones) and $k_0$ and $l$ are constants. A loudness level $L$
(phons) was defined by Stevens (ref. 12) as

$$L = 33.3 \log \mathcal{L} + 40 \tag{57}$$

for an input frequency of 1000 hertz. For other frequencies the coefficients may be different.
By considering equation (56) and noting that $\Pi$ is a function of $p$ and $\omega$, a
more general equation for the loudness level is

$$L(p, \omega) = l_p \log \frac{G(p)}{G_0} + l_\omega \log \frac{Q(\omega)}{Q_0} - L_m(\omega) \tag{58}$$
where $G(p)$ and $Q(\omega)$ are, respectively, functions of $p$ and $\omega$ to be determined from
$\Pi$; $G_0$ and $Q_0$ are reference values; the constant $l_p$ determines the loudness level
rise rate as a function of sound pressure; the constant $l_\omega$ determines the rise rate as
a function of $\omega$; and $L_m(\omega)$ is a function of frequency which accounts for the fact that
the detection threshold occurs at a nonvanishing sound pressure. (Note that log signifies
logarithm to the base 10.) The condition $l_\omega \neq l_p$ would imply that the psychoacoustic
conversion is frequency dependent and, hence, that the brain introduces an additional
filtering effect. Similar formulas could be given for noisiness or annoyance and
other psychoacoustic phenomena. Equation (58) finally quantitatively relates subjective
loudness judgments to the physical sound-pressure input.

For loudness levels $L$ greater than 40 phons (up to at least 110 phons) at 1000
hertz, doubling the loudness $\mathcal{L}$ corresponds to a 10-phon increase in the loudness level
(ref. 3, p. 193; ref. 12), a 10-decibel increase in intensity level T (ref. 3, p. 186f;
ref. 12), or tripling the sound pressure p. For loudness levels less than 40 phons, the
relation between loudness and loudness level is not so simple. In this range the loudness
varies to a much greater degree as a function of loudness level (ref. 3, p. 193; or
ref. 4).
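
The 10-phon-per-doubling rule follows directly from the coefficient 33.3 in equation (57), since 33.3 log 2 is very nearly 10. A minimal Python sketch of the conversion at 1000 hertz is given below; the function names are hypothetical, and, as noted in the text, the relation is intended only for loudness levels above about 40 phons.

```python
import math


def phons_from_sones(sones):
    """Loudness level (phons) from loudness (sones) at 1000 Hz, eq. (57)."""
    return 33.3 * math.log10(sones) + 40.0


def sones_from_phons(phons):
    """Inverse of eq. (57)."""
    return 10.0 ** ((phons - 40.0) / 33.3)


if __name__ == "__main__":
    for sones in (1.0, 2.0, 4.0, 8.0):
        print(f"{sones:4.1f} sones -> {phons_from_sones(sones):6.2f} phons")
```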

The minimum detectable intensity level change is generally between 0.25 and 1.0
decibel (3- to 12-percent change in sound pressure ratio) for sine waves at intensity 
levels greater than 30 decibels, and considerably greater at lower intensity levels 
(ref. 3, p. 146). For impulsive sounds the minimum detectable intensity level change 
appears to be somewhat greater. For example, for sonic boom N waves, changes less 
than 2 decibels, or 25 percent in sound pressure, are apparently undetectable (ref. 27). 
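
The percentages quoted above follow from the definition of the decibel for pressure ratios, namely a level change of $\Delta$ decibels corresponds to a pressure ratio of $10^{\Delta/20}$. A small Python sketch (the function name is hypothetical) reproduces the conversion.

```python
def pressure_change_percent(level_change_db):
    """Percent change in sound pressure for a given level change in decibels."""
    return (10.0 ** (level_change_db / 20.0) - 1.0) * 100.0


if __name__ == "__main__":
    for db in (0.25, 1.0, 2.0):
        print(f"{db:4.2f} db -> {pressure_change_percent(db):5.1f} percent")
```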


PARTICULAR SOLUTIONS 

In calculating the loudness of various inputs the choice of a time- or frequency- 
domain calculation is decided simply on the basis of ease of calculation. At the very 
least, if the preceding theory is valid, it must be capable of predicting the loudness 
levels of pure tones and impulsive inputs. Thus, these examples will serve to illustrate 
the initial applications of the theory. 


Pure Tone Input 


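Assume that

$$p(t) = p_0 \cos(\omega_0 t - \Delta) \tag{59}$$

where $p_0$ is the pressure amplitude at the ear, $\omega_0$ is the driving frequency, and $\Delta$ is
a phase shift. As shown in appendix D,

$$\Pi(t) = \tfrac{1}{2} \kappa^2 R t_1 p_0^2 \omega_0^2 \left\{ |H(\omega_0)|^2 t_1 - \mathrm{Re} \left[ \frac{1}{2i\omega_0} H^2(\omega_0)\,e^{2i(\omega_0 t - \Delta)} \left( 1 - e^{-2i\omega_0 t_1} \right) \right] \right\} \tag{60}$$

If

$$t_1 \gg \frac{1}{\omega_0} \tag{61}$$

equation (60) reduces to

$$\Pi = \tfrac{1}{2} \kappa^2 R t_1^2 p_0^2 \omega_0^2 |H(\omega_0)|^2 \tag{62a}$$

$$\Pi = \tfrac{1}{2} \kappa^2 R t_1^2 p_0^2 |H_c(\omega_0)|^2 \tag{62b}$$
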

that is, the time-average power is proportional to the pressure intensity, the transmittance
of the complete auditory system, and the square of the averaging time, and is independent
of $t$. Inequality (61) is, in fact, satisfied by the auditory system. According
to Steudel (ref. 4; ref. 14, p. 158), $t_1 \approx 3 \times 10^{-4}$ second. The integration time
$t_1$ determined by Steudel is incorrect because the input signature, an impulse, was not
of sufficient duration to determine the true integration time. More recent measurements
(refs. 6 and 28 to 30) indicate that the integration time is more nearly 0.1 second,
with possible dependence on intensity. Von Bekesy adopted (ref. 11) the value
$t_1 = 0.2$ second. Using this value, equations (62) are valid if $\omega_0 \gg 5$, that is, if the
driving frequency is much greater than 1 hertz, say 10 hertz or greater.
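
The size of the neglected oscillating term can be bounded directly: its magnitude relative to the steady term cannot exceed $1/(\omega_0 t_1)$. The Python sketch below is illustrative only. It assumes the pure-tone power has the form of a steady term $|H(\omega_0)|^2 t_1$ minus the real part of an oscillating term, as in equation (60), and that $H$ has the form of equation (45); the function names, the unit constants, and the cutoff frequencies are hypothetical.

```python
import cmath
import math


def lowpass_response(omega, a, w1, w2):
    """Low-pass response H(w) in the form of eq. (45)."""
    return a / ((w1 + 1j * omega) * (w2 + 1j * omega))


def tone_power(t, t1, w0, p0, h0, kappa=1.0, r=1.0, delta=0.0):
    """Pure-tone power with the oscillating term retained, as in eq. (60)."""
    prefactor = 0.5 * kappa ** 2 * r * t1 * p0 ** 2 * w0 ** 2
    oscillating = (h0 ** 2 * cmath.exp(2j * (w0 * t - delta))
                   * (1.0 - cmath.exp(-2j * w0 * t1)) / (2j * w0)).real
    return prefactor * (abs(h0) ** 2 * t1 - oscillating)


def tone_power_steady(t1, w0, p0, h0, kappa=1.0, r=1.0):
    """Limiting form for t1 >> 1/w0, as in eq. (62a)."""
    return 0.5 * kappa ** 2 * r * t1 ** 2 * p0 ** 2 * w0 ** 2 * abs(h0) ** 2


if __name__ == "__main__":
    t1 = 0.2  # integration period adopted in the text, seconds
    w1 = 2.0 * math.pi * 200.0   # hypothetical lower cutoff
    w2 = 2.0 * math.pi * 5000.0  # hypothetical upper cutoff
    for f0 in (1.3, 11.0, 103.0, 1003.0):
        w0 = 2.0 * math.pi * f0
        h0 = lowpass_response(w0, 1.0, w1, w2)
        steady = tone_power_steady(t1, w0, 1.0, h0)
        worst = max(abs(tone_power(0.0037 * n, t1, w0, 1.0, h0) / steady - 1.0)
                    for n in range(200))
        print(f"f0 = {f0:7.1f} Hz   max deviation = {worst:.4f}   bound = {1.0 / (w0 * t1):.4f}")
```

The printed deviations shrink in proportion to $1/(\omega_0 t_1)$, which is the sense in which frequencies of 10 hertz or greater satisfy inequality (61) for $t_1 = 0.2$ second.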

In order to introduce the result given by equation (62) in equation (58), let

$$G(p) = \tfrac{1}{2} \kappa^2 R p_0^2 \tag{63}$$

$$G_0 = \tfrac{1}{2} \kappa^2 R p_r^2 \tag{64}$$

$$Q(\omega) = |H_c(\omega_0)|^2 \tag{65}$$

$$Q_0 = |H_c(\omega_m)|^2 \tag{66}$$

where $p_r$ is a reference sound pressure ($p_r = 2 \times 10^{-4}$ dyne cm$^{-2}$), and the frequency-response
function $H_c(\omega)$ peaks at the frequency $\omega_m$. By introducing the preceding relations
in equation (58) the loudness level of a pure tone is, therefore, given by


$$L_0(p_0, \omega_0) = l_p \log \left( \frac{p_0}{p_r} \right)^2 + l_\omega \log \frac{|H_c(\omega_0)|^2}{|H_c(\omega_m)|^2} - L_m(\omega_m) \tag{67a}$$


Finally, by virtue of equation (43),

$$L_0(p_0, \omega_0) = l_p \log \left( \frac{p_0}{p_r} \right)^2 + l_\omega \log \left\{ \frac{(\omega_0/\omega_1)^2 \left[ 1 + (\omega_m/\omega_1)^2 \right] \left[ 1 + (\omega_m/\omega_2)^2 \right]}{(\omega_m/\omega_1)^2 \left[ 1 + (\omega_0/\omega_1)^2 \right] \left[ 1 + (\omega_0/\omega_2)^2 \right]} \right\} - L_m(\omega_m) \tag{67b}$$

Note that the second term on the right-hand side of equations (67) vanishes if
$\omega_0 = \omega_m$.

The dependence of the loudness level $L_0$ obtained from equation (67b) on sound
pressure and frequency should correspond with Stevens and Davis' response curves
(ref. 14, p. 124; or ref. 15, p. 201). These curves, derived from data by Fletcher and
Munson (ref. 1; ref. 3, p. 188), were obtained by using pure tones supplied by earphones
with sound pressures measured near the eardrum. Intensity levels are presented
as a function of frequency with loudness level as a parameter, where the intensity level
T is defined by

$$T = 10 \log \left( \frac{p_0}{p_r} \right)^2 \tag{68}$$

The curves of Stevens and Davis differ from the better known Fletcher-Munson curves
(ref. 1; ref. 3, p. 188; or ref. 15, p. 200). The Fletcher-Munson curves correspond
to introducing the observer into the sound field facing the source of sound after the free-field
sound pressure has been measured. As a result the Fletcher-Munson curves include
diffraction of the sound by the human head. The present theory does not attempt
to account for this diffraction phenomenon, but rather, corresponds closely to the experimental
situation represented by the curves of Stevens and Davis. The theoretical
and experimental results will be compared following discussion of the theoretical loudness
of an impulsive input.
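
The structure of the pure-tone loudness-level formula can be made concrete with a short calculation. The Python sketch below is a hedged illustration, not the author's program: it assumes the formula consists of a pressure term, a frequency term built from the ratio $|H_c(\omega_0)|^2 / |H_c(\omega_m)|^2$ with $H_c$ in the form of equation (43), and a constant offset, as in equations (67). All function names and the numerical constants in the example are placeholders.

```python
import math


def band_pass_power(f, f1, f2):
    """|H_c|^2 in the form of eq. (43), up to a constant that cancels in ratios."""
    x1 = (f / f1) ** 2
    x2 = (f / f2) ** 2
    return x1 / ((1.0 + x1) * (1.0 + x2))


def pure_tone_loudness_level(level_db, f0, f1, f2, fm, l_p, l_w, l_m):
    """Loudness level (phons) of a pure tone, following the form of eq. (67b).

    level_db is 10*log10((p0/p_r)**2); all frequencies share one unit.
    """
    pressure_term = l_p * level_db / 10.0
    ratio = band_pass_power(f0, f1, f2) / band_pass_power(fm, f1, f2)
    return pressure_term + l_w * math.log10(ratio) - l_m


if __name__ == "__main__":
    # Placeholder constants for illustration only.
    f1, f2 = 200.0, 5000.0
    constants = dict(f1=f1, f2=f2, fm=math.sqrt(f1 * f2), l_p=10.0, l_w=10.0, l_m=6.0)
    for f0 in (50.0, 250.0, 1000.0, 4000.0, 16000.0):
        print(f"{f0:8.0f} Hz -> {pure_tone_loudness_level(60.0, f0, **constants):6.1f} phons")
```

When the peak frequency is taken as the geometric mean of the cutoff frequencies, the computed levels are symmetrical about that frequency on a logarithmic frequency scale, and the frequency term vanishes at the peak.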


Impulsive Input 


Let the sound pressure be represented by

$$p(t) = \pi_0\,\delta(t) \tag{69}$$

where $\pi_0$ is the impulse. With the impulse response given by equation (46), it is shown
in appendix D that the average power is given by


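$$\Pi(t) = \tfrac{1}{2} R t_1 \left( \frac{\kappa A \pi_0}{\omega_2 - \omega_1} \right)^2 \Big\{ \omega_2 \left[ \theta(t) \left( 1 - e^{-2\omega_2 t} \right) + \theta(t - t_1) \left( e^{-2\omega_2 (t - t_1)} - 1 \right) \right] + \omega_1 \left[ \theta(t) \left( 1 - e^{-2\omega_1 t} \right) + \theta(t - t_1) \left( e^{-2\omega_1 (t - t_1)} - 1 \right) \right] - \frac{4 \omega_1 \omega_2}{\omega_1 + \omega_2} \left[ \theta(t) \left( 1 - e^{-(\omega_1 + \omega_2) t} \right) + \theta(t - t_1) \left( e^{-(\omega_1 + \omega_2)(t - t_1)} - 1 \right) \right] \Big\} \tag{70}$$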


When $t = t_1$ the result is especially interesting because it represents the situation
where the integration period begins with the impulse, and, hence, is the period associated
with maximum loudness. For the human auditory system $\omega_1, \omega_2 \gg 1/t_1$, so that

$$\Pi(t_1) \approx \tfrac{1}{2} R t_1 \frac{(\kappa A \pi_0)^2}{\omega_1 + \omega_2} \tag{71}$$

By applying equation (58), the maximum loudness level is found to be

$$L_\delta(\pi_0) = l_p \log \left( \frac{\pi_0}{\pi_r} \right)^2 + l_\omega \log \frac{(\omega_1 + \omega_2)_r}{\omega_1 + \omega_2} - L_{\delta_0} \tag{72}$$

where $\pi_r$ is a reference impulse, $(\omega_1 + \omega_2)_r$ is the sum of the cutoff frequencies at the
reference pressure, and $L_{\delta_0}$ is a constant loudness level which accounts for the fact
that the linear relation between loudness level and intensity level does not include the
coordinate origin.
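
The reduction of the impulse power to its simple limiting form at $t = t_1$ can be verified numerically. In the Python sketch below, which is illustrative only, the full expression is taken in the form of equation (70), with each bracketed window written as $\theta(t)(1 - e^{-at}) + \theta(t - t_1)(e^{-a(t - t_1)} - 1)$, and the limit in the form of equation (71). The function names, unit constants, and cutoff frequencies are assumptions.

```python
import math


def impulse_power(t, t1, w1, w2, kappa=1.0, a=1.0, pi0=1.0, r=1.0):
    """Average power for an impulsive input, in the form of eq. (70)."""
    def window(rate):
        # theta(t)*(1 - exp(-rate*t)) + theta(t - t1)*(exp(-rate*(t - t1)) - 1)
        total = 0.0
        if t > 0.0:
            total += 1.0 - math.exp(-rate * t)
        if t > t1:
            total += math.exp(-rate * (t - t1)) - 1.0
        return total

    prefactor = 0.5 * r * t1 * (kappa * a * pi0 / (w2 - w1)) ** 2
    cross = 4.0 * w1 * w2 / (w1 + w2)
    return prefactor * (w2 * window(2.0 * w2) + w1 * window(2.0 * w1)
                        - cross * window(w1 + w2))


def impulse_power_peak(t1, w1, w2, kappa=1.0, a=1.0, pi0=1.0, r=1.0):
    """Limiting value at t = t1 for w1, w2 >> 1/t1, in the form of eq. (71)."""
    return 0.5 * r * t1 * (kappa * a * pi0) ** 2 / (w1 + w2)


if __name__ == "__main__":
    t1 = 0.2
    w1 = 2.0 * math.pi * 200.0   # hypothetical lower cutoff, rad/s
    w2 = 2.0 * math.pi * 5000.0  # hypothetical upper cutoff, rad/s
    print("full expression at t1:", impulse_power(t1, t1, w1, w2))
    print("limiting form        :", impulse_power_peak(t1, w1, w2))
```

The agreement rests on the identity $\omega_1 + \omega_2 - 4\omega_1\omega_2/(\omega_1 + \omega_2) = (\omega_2 - \omega_1)^2/(\omega_1 + \omega_2)$, which cancels the factor $(\omega_2 - \omega_1)^{-2}$ in the prefactor.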


COMPARISON WITH EXPERIMENT 

Theoretical and experimental curves of the loudness level spectra for pure tones, 
with intensity level T as parameter, are compared in figure 3. The test curves are 
cross-plots of Stevens and Davis’ equal- loudness curves. In the form displayed in fig- 
ure 3 the similarity of the curves to ordinary filter curves is evident. Theoretical 
points computed for 1/3-octave intervals using equation (67b) for sinusoidal inputs have 
been superposed for comparison purposes. 

In order to fit Stevens and Davis' response curves, the constants $l_\omega$, $\omega_1$, and $\omega_2$
were adjusted to obtain a best fit at each intensity level. The "peak" frequency $\omega_m$
was not chosen independently despite the fact that it represents an independent constant.
Rather, $\omega_m$ was assumed to be the geometric mean of the cutoff frequencies $\omega_1$ and
$\omega_2$; that is, $\omega_m = (\omega_1 \omega_2)^{1/2}$. This assumption causes the frequency-response curves to be
symmetrical about $\omega_m$ on the log-log scale in figure 3. Despite this unnecessary restriction
the theoretical and experimental curves are found to be in very good agreement.
The two sets of curves generally differ by less than 2 phons over the entire intensity
level range of 10 to 130 decibels and audible frequency range. The greatest disparity
of the results is 10 phons at the highest frequencies in the midintensity-level (70
to 90 db) range. The 2-phon difference is certainly no greater than the errors induced
by the instrumentation, by averaging results (loudness level probable error of the order
of 6 db according to ref. 31) from a large number of tests (297 observations on 11 observers),
and by cross-plotting the response curves from published equal-loudness contours.
In fact, the largest disparity might result primarily from the experiment, rather
than from inadequacy of the theory.

By fitting equation (67b) to the test data for each input intensity level a set of values
of the constants $l_\omega$, $\omega_1$, $\omega_2$, and $\omega_m$ is obtained for each intensity level. Values of
$l_p \log(p_0/p_r)^2 - L_m(\omega_m)$ and $l_\omega$ as functions of intensity level have been plotted in figure 4
with $l_p = 10$ phons. A similar plot for $\omega_1$, $\omega_2$, and $\omega_m$ is given in figure 5.
Note that in most cases the values of the constants can be fit fairly well by straight lines
on the log-log plots, although the scatter of the values for $l_\omega$ is excessive. In figure 5
it is apparent that the auditory bandwidth increases markedly as the input intensity
is increased. However, the geometric mean frequency $\omega_m$ remains in the frequency
interval 500 to 2000 hertz throughout the entire intensity level range of 10 to 130 decibels,
the higher geometric mean frequency occurring at the lower intensity level.

The straight-line fits to the data in figures 4 and 5 can be formulated to provide a 
set of equations which determine the frequency response over the entire intensity and 
frequency ranges covered by the data. Of course, the resulting formulas cannot be ex- 
pected to fit the experimental data nearly as well as the individual fitting process used to 
obtain figure 3. The formulas for the coefficients determined by the straight lines in 
figures 4 and 5 are as follows: 


$$10 \log(p_0/p_r)^2 = 1.0030\,T \ \text{phons} \tag{73}$$

$$L_m(\omega_m) = 6 \ \text{phons} \tag{74}$$

$$l_\omega = 0.3636\,T + 36.6 \ \text{phons} \tag{75}$$

$$\log(\omega_1/2\pi) = -0.0158\,T + 3.1461 \tag{76}$$

$$\log(\omega_2/2\pi) = 0.003066\,T + 3.7404 \tag{77}$$

$$\log(\omega_m/2\pi) = -0.004816\,T + 3.3222 \tag{78}$$

These formulas in conjunction with equation (67b) determine the loudness level $L_0$ as a
function of the input sound intensity level T and frequency $\omega_0$. Note that the values of
the auditory constants are actually functions of the intensity and that $l_\omega \neq l_p = 10$.
From this inequality it follows that the psychoacoustic conversion involves additional
filtering. The frequency-response curves predicted by using the formulas above are
compared in figure 6 with the experimental curves. The disagreement between theory
and experiment is generally much less than 10 phons, which corresponds to loudnesses
differing by a factor of 2. The disagreement for any particular intensity level may, of
course, be reduced by slightly manipulating the values of the constants in equations (73)
to (78).
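
The trends read from figure 5 can be reproduced from the straight-line formulas. The Python sketch below evaluates the cutoff and peak frequencies of equations (76) to (78) over the intensity-level range of the data; the coefficients are used as they are printed in this copy of the report, and the function name is hypothetical.

```python
def fitted_frequencies_hz(level_db):
    """Cutoff and peak frequencies (Hz) from the straight-line fits (76) to (78)."""
    f1 = 10.0 ** (-0.0158 * level_db + 3.1461)     # lower cutoff, eq. (76)
    f2 = 10.0 ** (0.003066 * level_db + 3.7404)    # upper cutoff, eq. (77)
    fm = 10.0 ** (-0.004816 * level_db + 3.3222)   # peak frequency, eq. (78)
    return f1, f2, fm


if __name__ == "__main__":
    for level in range(10, 131, 20):
        f1, f2, fm = fitted_frequencies_hz(level)
        print(f"T = {level:3d} db   f1 = {f1:8.1f}   f2 = {f2:8.1f}   fm = {fm:7.1f}   "
              f"bandwidth = {f2 - f1:8.1f} Hz")
```

With these coefficients the bandwidth $f_2 - f_1$ grows steadily with intensity level, while the peak frequency falls from roughly 1900 hertz at 10 decibels to about 500 hertz at 130 decibels.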

Figure 6. - Psychoacoustic filter curves (collectively fit).

It is important to be able to relate the loudness of various pressure signatures to a
single reference input so that the loudness of different signatures can be compared on a
single loudness scale. It is most desirable to relate the loudness of sine waves and impulses
in this manner. By virtue of Stevens' law (eq. (56)) equally loud sine waves and
impulses should correspond to a constant intensity level difference (which might be zero)
over the entire range of intensities for which the law is valid. Unfortunately, the production
of perfect impulses is impossible. Alternatively, input signatures resembling
impulses can be produced which will serve to illustrate how the sine wave and impulsive
intensity levels corresponding to equal loudness levels are related. For example,
Steudel (ref. 4) compared the amplitude for equal loudness of an exponentially decaying
finite-amplitude impulse with that of a 1000-hertz sine wave. The initial rise time of
the impulse was unspecified but very short in comparison with the 1-millisecond time
constant of the exponential decay. For equal loudness levels the intensity level of the
impulse was found to be approximately 10 decibels greater than that of the 1000-hertz
sine wave over a wide range of intensities. Alternatively, for equal input intensities the
loudness level of the 1000-hertz sine wave was 10 phons greater than that of the impulse.
Steudel's results (minus the data points, for which the scatter is ±10 phons) are effectively
exhibited in figure 7. By replacing his results for the sine wave by those of
Stevens and Davis (adopted from Fletcher and Munson's tests) the curves in figure 7 have
been based on scales with known reference values, whereas Steudel's original curves
were not. Steudel's measurements extended only up to loudness levels of 100 phons.

Figure 7. - Relation between loudness level and intensity level for 1000-hertz
sine wave and impulses.

In figure 7, Steudel's curve for the exponentially decaying impulse is also compared
with the predicted loudness of a true impulse according to the present theory. The relation
between intensity and loudness level for the impulse is given by equation (72), where

$$T = 10 \log(\pi_0/\pi_r)^2 \tag{79}$$

because $\pi_0 \propto p_0$ and $\pi_r \propto p_r$. The loudness level $L_\delta$ is not quite proportional to the
intensity level T because of the appearance of the ratio of sums $\omega_1 + \omega_2$ in equation (72).
As noted previously $\omega_1$ and $\omega_2$ are functions of T. The curve shown is for

$$l_p = l_\omega = 10 \ \text{phons} \tag{80}$$

$$L_{\delta_0} = 16 \ \text{phons} \tag{81}$$

$$(\omega_1 + \omega_2)_r = \;\ldots \tag{82}$$

where the selected reference used to determine $(\omega_1 + \omega_2)_r$ is T = 0 decibels (cf. fig. 4).

The agreement between the theoretical and experimental curves is excellent over the in- 
tensity level interval 55 to 115 decibels, where the upper value corresponds to the limit 
for which Steudel presented data. 


COMPARISON WITH OTHER THEORIES 


The present theory encompasses some existing theories and formulas for loudness
and deviates from others. For example, among the latter category, Steudel (ref. 4)
proposed a loudness equation based on the square of the impulse integral; that is,

$$\left[\,\cdots\,\right]^2_{\max}$$

where $P = P_0$ when $t = t_0$, and the integral is to be maximized. This integral correctly
accounts for the loudness when $P = P_0$ throughout the time interval t; specifically, it
indicates no response. It also correctly accounts for the loudness of a step function as
being proportional to the square of the pressure amplitude of the step. However, more
generally, it predicts that all pressure signatures which possess the same amplitude and
impulse over the integration period will be equally loud. Vast experience with sonic
booms indicates that this result is incorrect. Moreover, the loudness of continuous,
statistically stationary, sounds is known to be proportional to the acoustic intensity, not
to the impulse. Thus, Steudel's equation must be incorrect. Bürck, Kotowski, and
Lichte (ref. 5) criticized Steudel's formula on other grounds.

Bürck, Kotowski, and Lichte frequency analyzed impulsive sounds in a manner
bearing some similarity to, but cruder than, that adopted herein. They assumed that
the mathematical physics was linear and applied Fourier methods. The input signature
was transformed into the frequency domain and mathematically filtered by an ideal
broad-band filter whose cutoff frequencies were a function of the expected loudness.
The loudness was effectively defined as the square root of the transmitted energy in the
filter bandwidth. Zepler and Harel (ref. 6) independently repeated Bürck, Kotowski,
and Lichte's frequency-domain approach but performed nonideal filtering numerically
using experimental frequency response curves for humans, and defined loudness as
herein. Errors, debated elsewhere (refs. 32 and 33), in Zepler and Harel's experimental
procedure led to ideas which culminated in the present theory.

Zepler and Harel's frequency-domain formulation was intended to apply to sonic
booms, which are quasi-impulsive if the body producing the boom is sufficiently short.
Their filtering process is numerical, but in the theory proposed herein it is expressed
by a simple analytical function. Their formula for the "weighted energy density" is
equivalent to equation (33b). Equation (33b) is valid if $p(t)$ is initiated and effectively
vanishes within the period $t_1$. On the contrary, the amplitude-limited ramp function selected
by Zepler and Harel possesses maximum amplitude, equal to the shock overpressure,
at time $t_1$. Because the effective loudness-producing portion of the signature
is concentrated in the time interval ($\ll t_1$) occupied by the shock, equation (33b) might
conceivably be valid in this case also. However, no attempt will be made herein to test
the validity of Zepler and Harel's application of equation (33b).

A frequency-domain formulation for the annoyance of statistically stationary noise
has been proposed by Jones (ref. 34). Jones' formula is basically equivalent to that of
Zepler and Harel for loudness and, hence, to equation (33b). Presumably equation (29b),
rather than equation (33b), is the correct equation for steady noise. The two equations
differ in that equation (33b) involves the ordinary pressure spectrum whereas equation
(29b) involves the running pressure spectra associated with the initiation and termination
of the auditory integration period. If the integration process is extended from the finite
interval $t_1$ (≈0.2 sec) to the infinite interval $(-\infty, \infty)$ equation (29b) reduces to equation
(33b). The assumption that the sound intensity spectrum $|P(\omega)|^2$ over all time is
equivalent to the difference of running intensity spectra $|P(\omega, t)|^2$ over the period $t_1$
is tacit in all noise tests, but apparently has never been verified.

Unlike the present analysis, those of Zepler and Harel (ref. 6) and Jones (ref. 34) 
do not attempt to link the mathematics to operations which occur within the auditory 
system. 




CONCLUDING REMARKS 


A few general remarks should be made regarding implications of present loudness 
theories. 

The claimed successes of the frequency-domain theories of Zepler and Harel for 
impulsive sonic booms and Jones for continuous noise indicate that the validity of the 
present theory may be quite general. In particular, Jones' results indicate that the 
existing multitude of schemes and units for evaluating the subjective aspects of noise 
such as loudness and annoyance may be irrelevant. It is suggested from the present 
theory that Jones’ introduction of still another noise unit, "perceived sound level, " 
may also be unnecessary. Stevens' loudness and loudness level units, sones and phons, 
respectively, appear more than sufficient. 

An important question in auditory studies concerns the extent to which the auditory
system is nonlinear. The present theory, in conjunction with the results of Zepler and
Harel and particularly those of Jones, substantiates the conclusion of Bürck, Kotowski,
and Lichte that the auditory system, at least that part preceding the brain, is effectively
linear with regard to loudness. The nonlinear part of the system is involved in the
psychoacoustic conversion which occurs in the brain. This does not mean that the action
of the ear and nervous system is completely linear, but only that possible nonlinearities
have a negligible effect on loudness.

Although Zwislocki's theory (ref. 7) of temporal auditory summation is incorporated
in the present theory, the theoretical constants $\omega_1$ and $\omega_2$ (in the present notation) are
shown herein to be response cutoff frequencies and have values (fig. 5) which differ
greatly from those given by Zwislocki. The difference in the values of the constants may
result simply from the fact that Zwislocki's values were calculated from Galambos' measurements
of stapedius muscle contractions in response to electric shocks to the medulla
rather than auditory hair cell electrical outputs from pressure disturbances.

Finally, it should be recognized that the present theory obviously does not include 
all known effects on loudness. For example, auditory fatigue (ref. 3, p. 272f) and the 
"cocktail party" effect are two phenomena not incorporated in the present theory. 

Lewis Research Center, 

National Aeronautics and Space Administration 
Cleveland, Ohio, June 12, 1969, 

129-01-07-06-22. 
APPENDIX A


SYMBOLS 


$A$                coefficient
$c$                speed of sound
$F(\omega)$        function of frequency
$f(t)$             function of time
$G_0$              reference sound pressure function
$G(p)$             sound pressure function
$H(\omega)$        frequency response
$H(\omega, t)$     "running" frequency response
$h(t)$             impulse response
$I$


Mt) — p(t - r)dr 

at 


i 

$J(\omega)$        electric current spectrum
$J(\omega, t)$     "running" electric current spectrum
$j(t)$             time-dependent electric current
$k$                coefficient in Stevens' psychophysical law
$k_1$              coefficient in Fechner's psychophysical law
$k_0$              loudness coefficient
$L$                loudness level, phons (ref. 1000-Hz sine wave at intensity $10^{-16}$ W cm$^{-2}$)
$L_m$              loudness level corresponding to vanishing intensity level
$L_m(\omega_m)$    loudness level at peak response frequency corresponding to vanishing intensity level
$L_0$              loudness level of sine wave pressure input
$L_{\delta_0}$     impulse loudness level corresponding to vanishing intensity level
$\mathcal{L}$      loudness, sone (loudness of 1000-Hz sine wave 40 db above threshold reference intensity)
ln

log 

Z P 

*« 

m 

n 

p(«) 

p(«,t) 

p 

p 0 

p(t) 

Q 

Q(w) 

R 

T 

3- 

t,t 

*0 

tl 

v n 

y 

a 

n 

A 

Aw 

5 

V 

e 

0(t) 


natural logarithm 
common logarithm 

loudness level pressure coefficient, phons 
loudness level frequency coefficient, phons/decade 
exponent in Stevens’ psychophysical law 
integer 

pressure spectrum 

"running” pressure spectrum 

acoustic pressure 

input pressure amplitude at the ear 

pressure time history, or signature 

reference frequency function 

frequency function 

electrical resistance 

real part 

duration of impulse 
time duration 
time 

input initiation time 
effective termination time of input 
integration period, auditory integration period 
velocity component normal to control surface 
angular frequency (dummy variable) 

Fourier coefficient 
phase angle 

radian frequency increment 
unit impulse function 
t - r 

^oh / 277 

unit step function 


tan - * (w/coj) 

©2 tan -1 (w/o^) 

k units conversion factor relating acoustic pressure and electric current 

4 T - T 

II electrical power 

n electrical power output from hair cells 

V 

input impulse 
reference impulse 
p mass density 

r, r time (dummy variables) 

T acoustic intensity level, db (ref. 10"*® W cm”^) 

<p magnitude of physical stimulus in psychophysical law 

acoustic intensity 

4/ psychological magnitude in psychophysical law 

c v , w' angular frequency 

o> m angular frequency at which auditory frequency response peaks for a given 

input amplitude 

u>q auditory input frequency 

auditory lower cutoff frequency 
o >2 auditory upper cutoff frequency 

— Fourier transform 

Subscripts: 

c complete auditory system 

r reference 

Superscripts: 

* complex conjugate 

“ infinite time average 

~ finite time average 
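APPENDIX B


RUNNING CURRENT SPECTRUM

In order to express the running current spectrum $J(\omega, t)$ as a function of the input
sound pressure $p(t)$, or its spectrum $P(\omega, t)$, note that

$$J(\omega, t) \leftrightarrow \theta(t - \tau)\,j(\tau) \tag{18}$$

Because $h(t)$ is causal (ref. 16, p. 85), equations (7) may be written as

$$j(\tau) = \kappa \int_{-\infty}^{\infty} \theta(\tau')\,h(\tau')\,p(\tau - \tau')\,d\tau' \tag{7a'}$$

$$j(\tau) = \kappa \int_{-\infty}^{\infty} \theta(\tau - \tau')\,h(\tau - \tau')\,p(\tau')\,d\tau' \tag{7b'}$$

Using equation (7b'),

$$J(\omega, t) = \kappa \int_{-\infty}^{\infty} \theta(t - \tau)\,e^{-i\omega\tau}\,d\tau \int_{-\infty}^{\infty} \theta(\tau - \tau')\,h(\tau - \tau')\,p(\tau')\,d\tau'$$

$$= \kappa \int_{-\infty}^{\infty} p(\tau')\,d\tau' \int_{-\infty}^{\infty} \theta(t - \tau)\,\theta(\tau - \tau')\,h(\tau - \tau')\,e^{-i\omega\tau}\,d\tau$$

Let

$$\xi = \tau - \tau'$$

Then,

$$J(\omega, t) = \kappa \int_{-\infty}^{\infty} p(\tau')\,e^{-i\omega\tau'}\,d\tau' \int_{-\infty}^{\infty} \theta(t - \tau' - \xi)\,\theta(\xi)\,h(\xi)\,e^{-i\omega\xi}\,d\xi \tag{B1}$$

$$= \kappa \int_{-\infty}^{\infty} \theta(t - \tau')\,H(\omega, t - \tau')\,p(\tau')\,e^{-i\omega\tau'}\,d\tau' \tag{20a'}$$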


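An alternative expression for $J(\omega, t)$ is obtained using equation (7a'). Thus,

$$J(\omega, t) = \kappa \int_{-\infty}^{\infty} \theta(t - \tau)\,e^{-i\omega\tau}\,d\tau \int_{-\infty}^{\infty} \theta(\tau')\,h(\tau')\,p(\tau - \tau')\,d\tau'$$

$$= \kappa \int_{-\infty}^{\infty} \theta(\tau')\,h(\tau')\,d\tau' \int_{-\infty}^{\infty} \theta(t - \tau)\,p(\tau - \tau')\,e^{-i\omega\tau}\,d\tau$$

$$= \kappa \int_{-\infty}^{\infty} \theta(\tau')\,h(\tau')\,d\tau' \int_{-\infty}^{\infty} \theta(\eta - \xi)\,p(\xi)\,e^{-i\omega(\xi + \tau')}\,d\xi \tag{B2}$$

where $\xi = \tau - \tau'$ and

$$\eta = t - \tau'$$

Finally,

$$J(\omega, t) = \kappa \int_{-\infty}^{\infty} \theta(\tau')\,h(\tau')\,P(\omega, t - \tau')\,e^{-i\omega\tau'}\,d\tau' \tag{20b'}$$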
APPENDIX C


FREQUENCY-DOMAIN REPRESENTATION OF ELECTRIC POWER 


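In the time domain the electric power corresponding to the information output of the
auditory nerve endings is given by

$$\Pi(t) = R t_1 \int_{t - t_1}^{t} \left| \frac{d}{d\tau}\, j(\tau) \right|^2 d\tau \tag{10}$$

or

$$\Pi(t) = R t_1 \int_{-\infty}^{\infty} \left\{ \left| \frac{d}{d\tau} \left[ \theta(t - \tau)\, j(\tau) \right] \right|^2 - \left| \frac{d}{d\tau} \left[ \theta(t - t_1 - \tau)\, j(\tau) \right] \right|^2 \right\} d\tau \tag{10'}$$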


where the impulsive transients contained in equation (10'), but not in equation (10), are
unphysical and are to be neglected. The Fourier transform of the first term in the
integrand in equation (10') is given by

$$\frac{d}{d\tau} \left[ \theta(t - \tau)\, j(\tau) \right] \leftrightarrow i\omega\, J(\omega, t) \tag{24}$$


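Hence, the frequency-domain representation of the integral of the first term in the
integrand in equation (10') is

$$\begin{aligned}
\int_{-\infty}^{\infty} \left| \frac{d}{d\tau} \left[ \theta(t - \tau)\, j(\tau) \right] \right|^2 d\tau
&= \left( \frac{1}{2\pi} \right)^2 \int_{-\infty}^{\infty} d\tau \int_{-\infty}^{\infty} d\omega \cdot \omega\, J(\omega, t)\, e^{i\omega\tau} \int_{-\infty}^{\infty} d\omega' \cdot \omega'\, J^*(\omega', t)\, e^{-i\omega'\tau} \\
&= \left( \frac{1}{2\pi} \right)^2 \int_{-\infty}^{\infty} d\omega \cdot \omega\, J(\omega, t) \int_{-\infty}^{\infty} d\omega' \cdot \omega'\, J^*(\omega', t) \int_{-\infty}^{\infty} d\tau \cdot e^{-i(\omega' - \omega)\tau} \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega \cdot \omega\, J(\omega, t) \int_{-\infty}^{\infty} d\omega' \cdot \omega'\, J^*(\omega', t)\, \delta(\omega' - \omega) \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \omega^2\, J(\omega, t)\, J^*(\omega, t)\, d\omega
\end{aligned}$$

or

$$\int_{-\infty}^{\infty} \left| \frac{d}{d\tau} \left[ \theta(t - \tau)\, j(\tau) \right] \right|^2 d\tau = \frac{1}{2\pi} \int_{-\infty}^{\infty} \omega^2\, |J(\omega, t)|^2\, d\omega \tag{C1}$$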


By repeating this process with respect to the second term in the integrand in equation
(10') and then introducing the two results into equation (10'), it follows that

$$\Pi(t) = \frac{R t_1}{2\pi} \int_{-\infty}^{\infty} \left[ |J(\omega, t)|^2 - |J(\omega, t - t_1)|^2 \right] \omega^2\, d\omega \tag{26}$$

which is the desired frequency-domain representation for Π(t).
APPENDIX D


POWER OUTPUT FOR VARIOUS INPUTS 
Pure Tone 

Time-domain analysis. - Assume that

$$p(t) = p_0 \cos(\omega_0 t - \Delta) \tag{59}$$

$$p(t) = \frac{1}{2}\, p_0 \left[ e^{i(\omega_0 t - \Delta)} + e^{-i(\omega_0 t - \Delta)} \right] \tag{D1}$$

For abbreviation, let

$$I = \int_{-\infty}^{\infty} h(\tau')\, \frac{\partial}{\partial \tau}\, p(\tau - \tau')\, d\tau' \tag{D2}$$

so that equation (27a) for the power becomes

$$\Pi(t) = \kappa^2 R t_1 \int_{t - t_1}^{t} |I|^2\, d\tau \tag{D3}$$

By introducing the right-hand side of equation (D1) in equation (D2), it results that

$$I = \frac{i \omega_0 p_0}{2} \left[ H(\omega_0)\, e^{i(\omega_0 \tau - \Delta)} - H(-\omega_0)\, e^{-i(\omega_0 \tau - \Delta)} \right] \tag{D4}$$

so that

$$|I|^2 = \frac{\omega_0^2 p_0^2}{2} \left\{ |H(\omega_0)|^2 - \operatorname{Re} \left[ H^2(\omega_0)\, e^{2i(\omega_0 \tau - \Delta)} \right] \right\} \tag{D5}$$

because

$$|H(-\omega_0)|^2 = |H(\omega_0)|^2 \tag{D6}$$

$$H^*(\omega_0) = H(-\omega_0) \tag{D7}$$

$$H^*(-\omega_0) = H(\omega_0) \tag{D8}$$

Equation (D6) is valid for a symmetrical filter. The last two equations are valid if h(t) 
is real. When equation (D5) is introduced in equation (D3), it follows that 


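$$\Pi(t) = \frac{\kappa^2 R t_1 p_0^2 \omega_0^2}{2} \left\{ |H(\omega_0)|^2\, t_1 - \operatorname{Re} \left[ \frac{1}{2 i \omega_0}\, H^2(\omega_0)\, e^{2i(\omega_0 t - \Delta)} \left( 1 - e^{-2 i \omega_0 t_1} \right) \right] \right\} \tag{60}$$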
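As a numerical sanity check on equation (60), the following minimal Python sketch compares the closed form with a direct midpoint-rule evaluation of equation (D3), using I from equation (D4). The function names, the sample value chosen for H(ω₀), and all parameter values are illustrative assumptions rather than quantities taken from the report; H(−ω₀) is taken as the complex conjugate of H(ω₀), as in equation (D7).

```python
import cmath


def power_closed_form(t, t1, w0, p0, H0, delta=0.0, kappa=1.0, R=1.0):
    """Pure-tone power from the closed form, equation (60).

    H0 stands for H(w0); it is an arbitrary complex sample value here.
    """
    osc = (H0 ** 2 * cmath.exp(2j * (w0 * t - delta))
           * (1 - cmath.exp(-2j * w0 * t1))) / (2j * w0)
    return 0.5 * kappa ** 2 * R * t1 * p0 ** 2 * w0 ** 2 * (abs(H0) ** 2 * t1 - osc.real)


def power_numeric(t, t1, w0, p0, H0, delta=0.0, kappa=1.0, R=1.0, n=20000):
    """Same power from equation (D3), integrating |I|^2 over [t - t1, t]
    with the midpoint rule and I taken from equation (D4)."""
    step = t1 / n
    total = 0.0
    for k in range(n):
        tau = t - t1 + (k + 0.5) * step
        z = H0 * cmath.exp(1j * (w0 * tau - delta))
        # H(-w0) e^{-i(w0 tau - delta)} is the conjugate of z when h(t) is real.
        integrand = 0.5j * w0 * p0 * (z - z.conjugate())
        total += abs(integrand) ** 2 * step
    return kappa ** 2 * R * t1 * total


if __name__ == "__main__":
    args = (0.37, 0.2, 50.0, 2.0, 0.3 - 0.4j, 0.7)  # t, t1, w0, p0, H(w0), delta
    print(power_closed_form(*args), power_numeric(*args))
```

The two values should agree to within the quadrature error; the steady term proportional to |H(ω₀)|² t₁ dominates whenever ω₀t₁ is large.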


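Frequency-domain analysis. - With p(t) given by equation (59) and I = κ⁻¹ dj(τ)/dτ
given by equation (D4), let I be expanded as a Fourier series,

$$I = \sum_{n=-\infty}^{\infty} \alpha_n\, e^{2\pi i n \tau / t_1} \tag{D9}$$

where

$$\alpha_n = \frac{1}{t_1} \int_{t - t_1}^{t} I\, e^{-2\pi i n \tau / t_1}\, d\tau \tag{D10}$$

From equation (D9) it follows that

$$\int_{t - t_1}^{t} |I|^2\, d\tau = t_1 \sum_{n=-\infty}^{\infty} |\alpha_n|^2 \tag{D11}$$

because α_n is independent of τ. Hence, considering equation (D3), the power is given
by

$$\Pi(t) = \kappa^2 R t_1^2 \sum_{n=-\infty}^{\infty} |\alpha_n|^2 \tag{D12}$$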


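From equations (D4) and (D10),

$$\alpha_n = \frac{1}{2}\, \omega_0 p_0 \left[ H(\omega_0)\, e^{i(\omega_0 t - \Delta)} \left( 1 - e^{-i\omega_0 t_1} \right) (\omega_0 t_1 - 2\pi n)^{-1} + H(-\omega_0)\, e^{-i(\omega_0 t - \Delta)} \left( 1 - e^{i\omega_0 t_1} \right) (\omega_0 t_1 + 2\pi n)^{-1} \right] e^{-2\pi i n t / t_1} \tag{D13}$$

Hence,

$$\begin{aligned}
\sum_{n=-\infty}^{\infty} |\alpha_n|^2 = \frac{\omega_0^2 p_0^2}{4} \Bigg\{ & |H(\omega_0)|^2 \left( 1 - e^{-i\omega_0 t_1} \right) \left( 1 - e^{i\omega_0 t_1} \right) \sum_{n=-\infty}^{\infty} \left[ (\omega_0 t_1 - 2\pi n)^{-2} + (\omega_0 t_1 + 2\pi n)^{-2} \right] \\
& + \left[ H^2(\omega_0)\, e^{2i(\omega_0 t - \Delta)} \left( 1 - e^{-i\omega_0 t_1} \right)^2 + H^2(-\omega_0)\, e^{-2i(\omega_0 t - \Delta)} \left( 1 - e^{i\omega_0 t_1} \right)^2 \right] \sum_{n=-\infty}^{\infty} \left( \omega_0^2 t_1^2 - 4\pi^2 n^2 \right)^{-1} \Bigg\}
\end{aligned} \tag{D14}$$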


Applying the formula (ref. 35)

$$\sum_{n=-\infty}^{\infty} (\theta - n)^{-2} = \pi^2 \csc^2 \pi\theta \tag{D15}$$

with

$$\theta = \frac{\omega_0 t_1}{2\pi} \tag{D16}$$
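there results

$$\sum_{n=-\infty}^{\infty} \left[ (\omega_0 t_1 - 2\pi n)^{-2} + (\omega_0 t_1 + 2\pi n)^{-2} \right] = 2 \left( 1 - e^{i\omega_0 t_1} \right)^{-1} \left( 1 - e^{-i\omega_0 t_1} \right)^{-1} \tag{D17}$$

Also, using the formula (ref. 36)

$$\sum_{n=1}^{\infty} \left( \theta^2 - n^2 \right)^{-1} = \frac{1}{2}\, \theta^{-2} \left( \pi\theta \cot \pi\theta - 1 \right) \tag{D18}$$

it follows that

$$\sum_{n=-\infty}^{\infty} \left( \omega_0^2 t_1^2 - 4\pi^2 n^2 \right)^{-1} = \frac{1}{2 \omega_0 t_1} \cot \frac{\omega_0 t_1}{2} \tag{D19}$$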


By introducing the sums given by equations (D17) and (D19) in equation (D14) and then 
introducing the result in equation (D12), equation (60) is obtained. 
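The two series formulas used above can be checked numerically by truncating the sums. The sketch below is only illustrative: it reads equation (D15) as Σ (θ − n)⁻² = π² csc² πθ over all integers n and equation (D18) as Σ (θ² − n²)⁻¹ = ½ θ⁻² (πθ cot πθ − 1) over n ≥ 1, and the function names, the truncation length N, and the sample value of θ are assumptions made for the example.

```python
import math


def sum_d15(theta, N=200_000):
    """Truncated left-hand side of (D15): sum over n = -N..N of (theta - n)^-2."""
    return sum((theta - n) ** -2 for n in range(-N, N + 1))


def sum_d18(theta, N=200_000):
    """Truncated left-hand side of (D18): sum over n = 1..N of 1/(theta^2 - n^2)."""
    return sum(1.0 / (theta ** 2 - n ** 2) for n in range(1, N + 1))


def rhs_d15(theta):
    """Right-hand side of (D15): pi^2 csc^2(pi theta)."""
    return (math.pi / math.sin(math.pi * theta)) ** 2


def rhs_d18(theta):
    """Right-hand side of (D18): (pi theta cot(pi theta) - 1) / (2 theta^2)."""
    return 0.5 * theta ** -2 * (math.pi * theta / math.tan(math.pi * theta) - 1.0)


if __name__ == "__main__":
    theta = 0.37  # any non-integer value works; theta = w0 * t1 / (2 pi) in (D16)
    print(sum_d15(theta), rhs_d15(theta))
    print(sum_d18(theta), rhs_d18(theta))
```

Both truncated sums converge only as 1/N, so agreement to about four decimal places is all that should be expected at this truncation length.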


Impulse 


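Time-domain analysis. - Assume that

$$p(t) = \pi_0\, \delta(t) \tag{69}$$

where π₀ is the impulse. The electric power is given by equation (D3), where

$$I = \int_{-\infty}^{\infty} \frac{\partial}{\partial \tau} \left[ h(\tau - \tau') \right] p(\tau')\, d\tau' \tag{D20}$$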


By introducing equation (69) in equation (D20), it follows that

$$I = \pi_0\, \frac{\partial}{\partial \tau}\, h(\tau) = \pi_0\, h_c(\tau) \tag{D21}$$

where h(t) is given by equation (46). Hence,

$$\Pi(t) = \kappa^2 R t_1 \int_{t - t_1}^{t} |I|^2\, d\tau \tag{D22}$$

$$\Pi(t) = \kappa^2 R t_1 \left( \int_{0}^{t} |I|^2\, d\tau + \int_{t - t_1}^{0} |I|^2\, d\tau \right) \tag{D23}$$


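where the second integral in the last expression must vanish if t < t₁ because h(t) is
causal. By performing the indicated integrations the results are

$$\Pi(t < 0) = 0 \tag{D24a}$$

$$\Pi(0 < t < t_1) = \frac{1}{2} R t_1 \left( \frac{\kappa A \pi_0}{\omega_2 - \omega_1} \right)^2 \left[ \omega_2 \left( 1 - e^{-2\omega_2 t} \right) + \omega_1 \left( 1 - e^{-2\omega_1 t} \right) - \frac{4 \omega_1 \omega_2}{\omega_1 + \omega_2} \left( 1 - e^{-(\omega_1 + \omega_2) t} \right) \right] \tag{D24b}$$

$$\Pi(t_1 < t) = \frac{1}{2} R t_1 \left( \frac{\kappa A \pi_0}{\omega_2 - \omega_1} \right)^2 \left[ \omega_2\, e^{-2\omega_2 t} \left( e^{2\omega_2 t_1} - 1 \right) + \omega_1\, e^{-2\omega_1 t} \left( e^{2\omega_1 t_1} - 1 \right) - \frac{4 \omega_1 \omega_2}{\omega_1 + \omega_2}\, e^{-(\omega_1 + \omega_2) t} \left( e^{(\omega_1 + \omega_2) t_1} - 1 \right) \right] \tag{D24c}$$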


These three equations may be combined to yield equation (70). 

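Because ω₁, ω₂ ≫ 1/t₁, it follows from equation (D24b), or (D24c), that, when
t = t₁,

$$\Pi(t_1) \approx \frac{1}{2} R t_1 \left( \frac{\kappa A \pi_0}{\omega_2 - \omega_1} \right)^2 \left( \omega_2 + \omega_1 - \frac{4 \omega_1 \omega_2}{\omega_1 + \omega_2} \right) = \frac{1}{2} R t_1\, \frac{(\kappa A \pi_0)^2}{\omega_1 + \omega_2} \tag{71}$$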
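The piecewise result can be evaluated directly. The following Python sketch implements equations (D24a) to (D24c) as read here, with the bracket in (D24c) rewritten as differences of decaying exponentials (algebraically the same expression) so that large values of ωt₁ do not overflow. The function name, the default constants, and the sample cutoff frequencies are assumptions chosen for illustration only.

```python
import math


def impulse_power(t, t1, w1, w2, A=1.0, pi0=1.0, kappa=1.0, R=1.0):
    """Power response to an impulse pi0 * delta(t), equations (D24a)-(D24c)."""
    if t < 0:
        return 0.0  # (D24a): no response before the impulse arrives
    pref = 0.5 * R * t1 * (kappa * A * pi0 / (w2 - w1)) ** 2
    cross = 4.0 * w1 * w2 / (w1 + w2)
    if t < t1:
        # (D24b): the averaging window [t - t1, t] still contains t = 0
        return pref * (w2 * (1.0 - math.exp(-2.0 * w2 * t))
                       + w1 * (1.0 - math.exp(-2.0 * w1 * t))
                       - cross * (1.0 - math.exp(-(w1 + w2) * t)))
    # (D24c), using e^{-a t}(e^{a t1} - 1) = e^{-a (t - t1)} - e^{-a t}
    s = t - t1
    return pref * (w2 * (math.exp(-2.0 * w2 * s) - math.exp(-2.0 * w2 * t))
                   + w1 * (math.exp(-2.0 * w1 * s) - math.exp(-2.0 * w1 * t))
                   - cross * (math.exp(-(w1 + w2) * s) - math.exp(-(w1 + w2) * t)))


def peak_estimate(t1, w1, w2, A=1.0, pi0=1.0, kappa=1.0, R=1.0):
    """Approximation (71) for the power at t = t1 when w1, w2 >> 1/t1."""
    return 0.5 * R * t1 * (kappa * A * pi0) ** 2 / (w1 + w2)


if __name__ == "__main__":
    t1, w1, w2 = 0.2, 300.0, 6000.0  # hypothetical averaging time and cutoffs
    print(impulse_power(t1, t1, w1, w2), peak_estimate(t1, w1, w2))
```

With ω₁t₁ and ω₂t₁ both large, the value at t = t₁ should be indistinguishable from the estimate in equation (71), and the response decays to zero once the window has moved past the impulse.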


Frequency-domain analysis. - Because p(t) is impulsive and h(t) is quasi-impulsive,
equation (53) may be used to evaluate Π. Thus, by virtue of equations (11)
and (69),

$$P(\omega) = \pi_0$$

Hence, equation (53) becomes

$$\Pi(t_1) \approx \frac{1}{2\pi} R t_1 (\kappa A \pi_0)^2 \int_{-\infty}^{\infty} \frac{\omega^2\, d\omega}{\left( \omega^2 + \omega_1^2 \right) \left( \omega^2 + \omega_2^2 \right)} \tag{D25}$$

which results in (ref. 37)

$$\Pi(t_1) \approx \frac{1}{2} R t_1\, \frac{(\kappa A \pi_0)^2}{\omega_1 + \omega_2} \tag{71}$$

in agreement with the result from the time-domain analysis.
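The step from equation (D25) to equation (71) rests on the integral of ω²/[(ω² + ω₁²)(ω² + ω₂²)] over all ω being equal to π/(ω₁ + ω₂). A short Python sketch can confirm this numerically; the substitution ω = s tan u with s = √(ω₁ω₂), the function name, and the sample frequencies are assumptions introduced for the example and are not part of the report.

```python
import math


def d25_integral(w1, w2, n=20_000):
    """Midpoint-rule value of the integral in (D25) over the whole omega axis.

    The substitution omega = s * tan(u), s = sqrt(w1 * w2), maps the infinite
    range onto -pi/2 < u < pi/2, where the integrand is smooth and bounded.
    """
    s = math.sqrt(w1 * w2)
    step = math.pi / n
    total = 0.0
    for k in range(n):
        u = -0.5 * math.pi + (k + 0.5) * step
        sin2 = math.sin(u) ** 2
        cos2 = math.cos(u) ** 2
        denom = (s * s * sin2 + w1 * w1 * cos2) * (s * s * sin2 + w2 * w2 * cos2)
        total += s ** 3 * sin2 / denom * step
    return total


if __name__ == "__main__":
    w1, w2 = 300.0, 6000.0  # hypothetical lower and upper cutoff frequencies
    print(d25_integral(w1, w2), math.pi / (w1 + w2))
```

Multiplying the integral by R t₁ (κAπ₀)²/2π then gives the factor ½ R t₁ (κAπ₀)²/(ω₁ + ω₂) of equation (71).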